Ask the Primo Expert – Relevancy Mechanisms
Hello everyone. My name is [inaudible], and I am the quality of search team leader in Primo. Today's webinar is about relevance ranking. I have a few questions from users that are a starting point for going into certain areas of the system related to relevance. I'll go over each question and its answer, and then I'll extend the answer to other things related to that area.
The first question: "Our library harvests our local institutional repository into Primo, some 100,000 to 150,000 records. Because the repository also acts as an institutional bibliography, not all records of this repository have full text. We would like to make records with full text more visible than the records with no full text. Is there any way to make records with full text more relevant than records with no full text attached?"
I look at this as a question of how we make certain collections or records more relevant, how we boost them up, in Primo itself. So I will take it as a starting point for what we can do. In fact we have three ways. It depends on where we are searching, in essence on what our scope is. But if we take this question, where we have a local collection and want to boost certain things in it, the first way is the booster, the ranking booster in the normalization rules. I'll talk about it first and then show how it works in the Back Office.
In this case we know that we are talking about an institutional bibliography where not all the records have full text, but we do know where the records with full text come from, and we can identify which records arrive with full text and which do not. What we can do in the normalization rules is go to the ranking section, and there, in the booster, add a boost with normalization rules for the types of records that we want to boost up.
So I'll quickly go to the Back Office of a different environment and take one of the institutions here. We go to the normalization rules, and here we have our ranking section. In this ranking section I can use the booster: booster1 and booster2. What I can do in booster1 is set a rule that identifies the records that come from a certain data source, or whatever rule identifies the collection, and set a constant: the value of the boost that I want to give them.
Now, a value of 1 means that no boost is really done. It is a multiplier, so it multiplies the score of those records by one and nothing happens. We generally recommend not putting in very large numbers, but mostly two, three, four, single digits, so that the score is multiplied by two or three. There is no real recommendation for what the number should be, because it depends on the data inside the collection, the corpus. So the best thing to do is to test it in your staging or testing environment. Take a record or records that you want to boost and change the normalization rule. I would suggest starting with two and then going a bit higher, running the normalization rules, indexing, and seeing how it affects the specific collection or records that you would like to boost up. That is one of the simple ways, when you have a certain collection and you know what you want to boost. That's the first one.
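As a rough illustration of the multiplier idea described above, here is a minimal Python sketch. It is not Primo's code: the `Record` class, its fields, and the `rank` function are hypothetical names, and the only things taken from the talk are that the booster is a multiplier, that 1 means no boost, and that small values such as 2 are a sensible starting point.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str        # hypothetical identifier
    base_score: float     # relevance score before any boost
    has_full_text: bool   # stands in for "matched by the booster rule"


def booster_for(record: Record, boost: float = 2.0) -> float:
    """Return the multiplier for a record: `boost` if the rule matches, else 1.0 (no boost)."""
    return boost if record.has_full_text else 1.0


def rank(records: list[Record], boost: float = 2.0) -> list[Record]:
    """Sort records by boosted score, highest first."""
    return sorted(records, key=lambda r: r.base_score * booster_for(r, boost), reverse=True)


records = [
    Record("no-full-text", base_score=0.8, has_full_text=False),
    Record("with-full-text", base_score=0.5, has_full_text=True),
]
boosted_order = [r.record_id for r in rank(records, boost=2.0)]
unboosted_order = [r.record_id for r in rank(records, boost=1.0)]
```

With a boost of 2 the full-text record moves above the other one (0.5 × 2 against 0.8); with a boost of 1 the original order is kept, which is why the value has to be tested against real data.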
The second way is blending. Here we are not talking about what exists within your own institution's collection. It is when you start to blend with other collections, with third nodes, third-party deep searches: WorldCat, EBSCO, Primo Central. You want to blend them, but you want to make sure that certain engines are boosted higher, are more relevant, or that your collection is still stronger than whatever you are blending it with. We would like to blend and get results from everything, but we want to make sure that our institution is stronger.
So again I go back to the Back Office, start from the beginning, and go to the search engine configurations. Here we have our blending, our force blending parameters. Here I can decide for which engine, which deep search, I would like to force the blending. What does force blending mean? I am now getting two types of records from two different engines, and I want to decide which engine is more important and which is less important. Normally our recommendation is to start from the local engine. Let's say I am blending the local search engine with Primo Central, and I would like my local results to be higher up: stronger and more relevant than Primo Central, WorldCat, or EBSCO.
The first thing I can do is give a constant factor. This means that I give a constant boost to all the results that come from my local search engine, automatically boosting them up. Take the Primo Central score and the Primo local score: I multiply my Primo local results by that factor automatically, and then we see who is the winner between those results, which is more relevant.
Even with a constant factor, that may still not be enough. I may want to make sure that in certain searches, where I know that Primo Central or WorldCat are stronger than my local because of the data there, my local data still comes higher up in the results. For that I can use force blending, which has three parameters.
The first one is the minimum hit rank for combining, which can be high, medium, or low. I started to describe this one as the place in the results, and that was a mistake, sorry: the place is the location parameter. Top, center, and bottom is the location where I put the results. Top is around the second place, center is around the fifth, and bottom is around the seventh.
The minimum hit rank for combining says how high the local score has to be for the results to be combined at that location anyhow. What do I mean? Let's say the Primo Central score is 10 and the local score is 2. Do I still want to combine the local result at the top location, to put it in the second place, even though its score is only 2 and Primo Central is around 10? With the minimum hit rank set to low, I am saying that the threshold is low: even though the score is 2, I still want my local collection to come up. Medium says: no, I don't want very low ones to come up. If it is not a very relevant record, even though it is in my local collection, I do not want it in the second place; its score needs to be around the medium. Let's say the top score is 10: I would like the local score to be at least 5. These numbers are just for illustration. High, of course, means that only very relevant, very high scores are boosted up. If it is not a high score, I do not want to combine the local results with Primo Central. Here you can play around with this parameter to see how much you want the local results to be boosted anyhow.
The last one is the number of results: how many results from local I will boost up to that place. Let's take our example. We have the minimum hit rank set to low, the location set to top, and the number of results set to three. That means that even if all my Primo local results have a score of 2 and Primo Central has 10, I will take three local records, push them up above the Primo Central results at the top location, and merge them with the others from Primo Central. If the Primo local results are strong enough, really relevant just like Primo Central, then you will still see the local result in its place, and up to three others below it will be boosted up and blended in. This is just to reward the ones that aren't strong enough, that don't win on score against Primo Central or the other third node, and that we still want to boost up.
So this is our second way. It is not only for local, even though the main question was about local. We can do it with any engine. We can create scopes that blend WorldCat and EBSCO, WorldCat and Primo Central, local and WorldCat, whatever combination you use, and decide which of those engines is stronger and which is weaker.
Before I continue to the next way, the institution boost, I want to make an important point about blending. There is a parameter that I have seen is not used by everyone, and it is quite an important parameter.
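The constant factor and the three force blending parameters just described can be sketched as follows. This is a toy model under stated assumptions, not Primo's algorithm: the function name `force_blend`, the score thresholds for low, medium, and high, and the exact positions for top, center, and bottom are all illustrative values chosen to match the numbers used in the talk.

```python
# Illustrative values only; the talk gives "at least 5 out of 10" for medium
# and "around the 2nd, 5th, 7th place" for the locations.
MIN_HIT_RANK = {"low": 0.0, "medium": 5.0, "high": 8.0}
LOCATION = {"top": 1, "center": 4, "bottom": 6}  # 0-based list positions


def force_blend(local, remote, constant_factor=1.0,
                min_hit_rank="low", location="top", number_of_results=3):
    """Blend two lists of (record_id, score) and return the ordered record ids."""
    # Step 1: constant factor multiplies every local score.
    hits = [(rid, score * constant_factor, True) for rid, score in local]
    hits += [(rid, score, False) for rid, score in remote]
    merged = sorted(hits, key=lambda h: h[1], reverse=True)

    # Step 2: pick local hits that rank below the target location
    # but still pass the minimum hit rank threshold.
    target = LOCATION[location]
    threshold = MIN_HIT_RANK[min_hit_rank]
    promoted = [h for pos, h in enumerate(merged)
                if h[2] and pos > target and h[1] >= threshold][:number_of_results]

    # Step 3: move the promoted hits up to the target location.
    promoted_ids = {h[0] for h in promoted}
    rest = [h for h in merged if h[0] not in promoted_ids]
    blended = rest[:target] + promoted + rest[target:]
    return [h[0] for h in blended]


local_hits = [("L1", 2.0), ("L2", 2.0), ("L3", 1.5), ("L4", 1.0)]
central_hits = [("R1", 10.0), ("R2", 9.0), ("R3", 8.0), ("R4", 7.0)]

low_top = force_blend(local_hits, central_hits, min_hit_rank="low", location="top")
medium_top = force_blend(local_hits, central_hits, min_hit_rank="medium", location="top")
```

In this toy run, the low setting lifts three weak local records to just below the first result, while the medium setting leaves them where their scores put them because none reaches the threshold of 5.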
I go to the mapping tables and search for deep search. In the deep search plugin parameters there is a plugin parameter for the Primo rank, and it is sometimes false and sometimes true. Let's take WorldCat, where it is sometimes set to false. What happens if this is false, and what happens if it is true?
If it is false, we don't rank the WorldCat or other deep search results that are returned. We don't give them a new rank, so we cannot really blend them with the rank of the local results. Let's say we are blending the local search engine with WorldCat and we haven't turned on the ranking for WorldCat. We do a search in the local engine and get results, we do a search in WorldCat and get those results, and because we haven't given a rank to the WorldCat results based on our local search engine, what you normally see is that the results are merged page by page: a page of Primo local, the next page WorldCat, the next page Primo local, the next page WorldCat. There is no comparable relevance score between them. WorldCat returns a score on its own scale, which could be in the tens, hundreds, or thousands. Let's say their scale is between 10 and 1,000, while Primo is between 0 and 1. So there is nothing to compare between them, the blending doesn't work as well, and that's why you see it page by page.
If you turn the parameter to true, you will normally see that the results are better blended. What it does is give a rank to the deep search results. It runs the search in WorldCat, gets the results, and then, in technical terms, creates a mini index. In fact it ranks them on the same relevance scale as Primo local, and then when you blend them the scores are similar.
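To make the scale problem concrete, here is a small sketch of why raw scores from two engines cannot be merged directly. It uses min-max rescaling as a stand-in; the talk says Primo re-ranks the deep search results through a mini index, so this is only an assumption-laden illustration of "same scale", and it does not reproduce the page-by-page behavior. All names and score values are hypothetical.

```python
def min_max(hits):
    """Rescale (record_id, score) pairs to the 0..1 range within their own list."""
    scores = [score for _, score in hits]
    lowest, highest = min(scores), max(scores)
    span = (highest - lowest) or 1.0  # avoid division by zero when all scores are equal
    return [(rid, (score - lowest) / span) for rid, score in hits]


def merge(first, second):
    """Merge two hit lists by score, highest first, and return the record ids."""
    ordered = sorted(first + second, key=lambda h: h[1], reverse=True)
    return [rid for rid, _ in ordered]


local = [("L1", 0.9), ("L2", 0.5), ("L3", 0.1)]          # scores on a 0..1 scale
worldcat = [("W1", 1000.0), ("W2", 400.0), ("W3", 10.0)]  # scores on a 10..1000 scale

raw = merge(local, worldcat)                      # every W record outranks every L record
rescaled = merge(min_max(local), min_max(worldcat))  # the two sources interleave
```

On raw scores the larger scale simply wins, whatever the actual relevance. Once both lists are on one scale, the merge compares like with like.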
With the scores on the same scale it is easier to blend, and you'll see that the blending is much better. Of course, if you want the page-by-page look, you can leave it off, but I do suggest that it is better for the blending to turn the ranking to true.
The last way to boost is the institution boost. It is mainly used in consortia, or when you have several institutions. You can boost records based on the institution's collection. How does this work? Two things need to be done. First, the records need to have, in the delivery section of the PNX, an institution field with the institution code. The system uses that code to understand which records are connected to the institution. The second is to turn it on.
Where do we turn it on? In the views. I am in a consortium here with White Shore, Volcano, and other institutions. I take the Volcano view, edit it, go to the scope list and tabs, then to the tiles, and in the tiles I go to the brief display, the brief results. I'll do it again more slowly: Volcano, edit, scope list and tabs, tiles, and in the tiles the brief display (not locations, sorry, the brief results). In the brief results, lower down, we have for the view "Boost results from my institution". We check this parameter and then deploy.
What happens then? When you search in this consortium view, which has many institutions, and assuming, like we said before, that the PNX records of the whole consortium have the institution field with the institution code in the delivery section, the system boosts up the institution of the view. This is the Volcano view, so it boosts Volcano and, in a way, pushes the others lower down, so that the Volcano records are more visible and ranked higher. If I go to the White Shore view and it is turned on there, White Shore will be stronger and ranked higher than Volcano. If it is not turned on, the regular ranking applies to all the records, and they are blended between all the institutions.
That's the third way to get your local records higher. Of course you can also use the booster that we saw before, but if every institution puts its own records into the booster, then everyone is boosted on the same thing. So when you are in a consortium, it is better to use the institution boost to allow your records to be stronger and higher up in the results.
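The effect of the view-level institution boost can be pictured with a short sketch. The data layout, the institution codes, and the boost value of 2.0 are assumptions for illustration; the talk does not say what factor Primo applies, only that records whose delivery section carries the view's institution code are ranked higher.

```python
def rank_for_view(records, view_institution, boost_enabled=True, boost=2.0):
    """Return record ids ordered for a view; records of the view's institution get a boost.

    records: list of dicts with 'id', 'score', and 'institutions'
             (the institution codes from the record's delivery section).
    """
    def boosted_score(record):
        is_mine = view_institution in record["institutions"]
        factor = boost if (boost_enabled and is_mine) else 1.0
        return record["score"] * factor

    ordered = sorted(records, key=boosted_score, reverse=True)
    return [record["id"] for record in ordered]


consortium_records = [
    {"id": "volcano-rec", "score": 0.6, "institutions": ["VOLCANO"]},
    {"id": "whiteshore-rec", "score": 0.8, "institutions": ["WHITESHORE"]},
]

volcano_view = rank_for_view(consortium_records, "VOLCANO")
whiteshore_view = rank_for_view(consortium_records, "WHITESHORE")
boost_off = rank_for_view(consortium_records, "VOLCANO", boost_enabled=False)
```

The same two records come back in a different order depending on the view, and with the option unchecked the plain relevance order applies.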
Okay, so to sum up, the three ways that we talked about to boost up certain collections, certain records, certain institutions, or certain engines so that they are ranked higher are: the booster in the normalization rules, which we can set with a rule on the collections or records that we want to boost up; blending, between the engines themselves; and the institution boost inside a consortium, which is mainly between the institutions themselves.
We'll go to the next question: "If one adds or enhances terms in the synonyms, does it have any influence on the relevance ranking? For example, if the term searched for is allergy and the synonym file has four alternatives for allergy, are these four alternatives treated as equally relevant as the term allergy, which is of course the original term?"
I will talk about synonyms more fully, but let's answer the question first. Today the system does not treat the synonym as equal to the original term. We see the original term as the more important term. In fact, in all the recall and ranking that is done in Primo, the original term is always stronger than anything that is added: stronger than a synonym, stronger than stemming, as we'll see later on. It is always stronger because we see what the user entered as more important. Now, with synonyms, if you set one to very high, it will be close to the original term in its ranking. Not equal, but very close.
Let me show what I mean by set to very high. We go back to the Back Office, to the search engine configuration, and there we have the synonyms. This is the score that we give to each level. One moment, I'll open the English synonym file. Here we go. In the synonym files, let's take the English one, we have for example a spelling variant of yogurt that is marked very high, and another entry that is low. As you can see, we can set each entry to very high, high, medium, or low, like we saw in the search engine configuration, where you say what the boost for each level is. You can set how much each level gives, and of course very high should have the highest value in the Back Office configuration. Once you mark a synonym as very high and set that value as high as possible, it gets closer, but like I said, the system doesn't allow the synonym to be exactly like the original term, because we do not see it as the original term. It will always be a bit lower, whatever you put. So put it at very high if you want the terms to be nearly equal. In our question they want these four terms to be very close, so mark them as very high and give very high a relatively high score, close to 1, and that makes them as close as we can get to the original term.
The reason we have these different levels, high, low, and so on, is that there are many different kinds of things that are similar. Some synonyms are more closely related and some are less related. A lower one will still be added, but with a lower boost. If you don't have anything that matches the original word, then something will still come up that is connected at a lower relevance, but still connected. So you can have a certain hierarchy of how related your synonym is to the original word. The ones that are nearly the same we should mark very high. For example, we have here the Roman numerals, and we have 11 and the word eleven marked very high: if you write 11, you would also like the word eleven to be found, or the opposite. So it is very high. You can say it is the same thing, but not exactly.
I want to show an example of why I do not see it as exactly the same. I search for 16th century, and I do have a synonym here for 16th century. Now, there is a journal with sixteenth century in its name, and maybe that is what I wanted to look for. If I write 16th century, maybe I want that journal to come up; I am not sure. But we want to make sure that what the user wrote is what comes first. The records with 16th century as the user wrote it come up first, and not the record that we saw before, which is lower down in the results, as you can see as we go down.
Lower down we do have records with sixteenth century and things like that, but we take the user's terms as more important than the synonym, because that could be exactly what the user wanted. If I start giving the synonym matches as equals, that is not what the user wrote. The synonym is still in the recall, though. If we go further and look for that journal itself, and I add journal here, as you can see, Sixteenth Century Journal came up. That's the result. I don't have a 16th Century Journal with the number 16, but this one I do have, so here the synonym helped return the result that I wanted even though I didn't write it that way. Here the synonym came into play very nicely. But if you just wrote 16th century, that could be exactly what you wanted, and if synonyms were equal the journal would come up before the 16th century records that I had before. So the user's terms are very important to us. We don't want to lose the user's terms. We want to use the user's terms and also add other ones, not as equals but as a bit less.
I will also explain a bit about how synonyms work. Synonyms are only added in the keyword search and are not used when boosting phrases. What do I mean? Synonyms are per word. That's how Primo uses them. It means that when there are ranking algorithms that create phrases and give a different boost to the whole phrase, the synonym doesn't take part in that ranking. It mainly works on the keyword itself. In terms of recall, when I write 16th century, the synonym sixteenth century gives me the recall: it is in my 1,236 results. In ranking, 16th century takes precedence, because, like I said before, it is what the user wrote. The journal was the first result before because of the ranking for that search, where it was the only result that matched, and it is still in the recall here, still in my results.
This is very important to understand about how synonyms work. They are here to help the recall, to not lose things that exist in your corpus. In that way they help our algorithm, but they are not used in all the ranking algorithms. In a moment I will go to where we do use similar things in the algorithm, and that leads me to the next question. Just to sum up: synonyms are nearly equal but not equal, and the user's term always comes first.
The next question in fact stems from that one: "Stemming, synonyms, inflections, pluralization: what does each one do, and when does it happen?" There were many cases in the last few years. People know that there is stemming in the system; there is the feature from the beginning of last year, the inflections on the title; there is pluralization; there are the synonyms. When are these used, what do they do, and what do they mean? I will split them into three things.
Synonyms are based on the synonym file, and they are single terms. They are always added. When you search, if a term has a synonym in your file, the synonym is added to the search. Like we said, it mainly helps the recall, so that you won't lose results even though you only wrote one of the forms.
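The synonym behavior described above (added per word, weighted by level, never equal to the original term) can be sketched like this. The level weights, the one-entry synonym table, and the scoring function are assumptions for illustration; the only constraints taken from the talk are that every level stays below the original term's weight and that synonyms apply to single keywords rather than phrases.

```python
# Illustrative level weights; all are below 1.0, the weight of the user's own term.
LEVEL_WEIGHT = {"very_high": 0.95, "high": 0.8, "medium": 0.6, "low": 0.4}

# Hypothetical synonym file content: term -> list of (synonym, level).
SYNONYMS = {"16th": [("sixteenth", "very_high")]}


def score(title: str, query: str) -> float:
    """Score a title against a query, one keyword at a time.

    Each query term contributes 1.0 if it appears as written, or the level
    weight of a matching synonym otherwise, or 0.0 if nothing matches.
    """
    words = set(title.lower().split())
    total = 0.0
    for term in query.lower().split():
        variants = [(term, 1.0)]
        variants += [(syn, LEVEL_WEIGHT[level]) for syn, level in SYNONYMS.get(term, [])]
        total += max((weight for variant, weight in variants if variant in words), default=0.0)
    return total


exact_title = "16th century studies"
journal_title = "Sixteenth Century Journal"

exact_score = score(exact_title, "16th century")
journal_score = score(journal_title, "16th century")
journal_query_score = score(journal_title, "16th century journal")
exact_query_score = score(exact_title, "16th century journal")
```

For the query 16th century, the record that uses the user's own wording scores higher, but the journal is still recalled. Once the word journal is added, the journal record wins because it matches more of the query, which mirrors the demonstration in the talk.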
Okay, that is what we spoke about before. Next are stemming and pluralization, and then inflections. Pluralization, for us, is inside stemming. Briefly, stemming means that I take the word and bring it to its stem. Playing becomes play. Cats becomes cat, although that one is in fact pluralization. With stemming I take the word to its stem form: I change playing to play, or plays to play, so that if I searched with happily, the term is also brought to happy and finds results based on that. Pluralization is the opposite direction: I take cat and turn it into cats. Most people know this from the search engine configuration, in the results threshold, where there is the maximum results for stemming. It is 25 out of the box, which means that if I have fewer than 25 results, the system activates the stemming and pluralization mechanism. The idea is that you don't have enough results. Your search yielded a very small number of results, so the system will stem and pluralize your query to increase the number of results, and maybe the results that you wanted were in fact among those that you didn't get from your search. So it is based on the number of results, it helps the recall, and in fact it applies to all the fields.
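The threshold logic can be sketched in a few lines. The toy stemmer below is only a stand-in (Primo's actual stemming and pluralization rules are not described in the talk), and the function names and sample documents are hypothetical; what the sketch takes from the talk is that expansion only happens when the exact search returns fewer results than the configured maximum, 25 out of the box.

```python
MAX_RESULTS_FOR_STEMMING = 25  # out-of-the-box value mentioned in the talk

# Toy suffix rules: (suffix, replacement). A real stemmer is far more careful.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("s", "")]


def naive_stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters of the word."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


def search(query: str, documents: list[str], threshold: int = MAX_RESULTS_FOR_STEMMING) -> list[str]:
    """Exact keyword search; fall back to stemmed matching when results are below the threshold."""
    terms = query.lower().split()
    exact = [doc for doc in documents if all(t in doc.lower().split() for t in terms)]
    if len(exact) >= threshold:
        return exact  # enough results, stemming is not activated

    query_stems = {naive_stem(t) for t in terms}
    return [doc for doc in documents
            if query_stems <= {naive_stem(w) for w in doc.lower().split()}]


docs = ["cats and babies", "what is a cat", "playing with a cat"]
expanded = search("cat", docs)                 # 2 exact hits < 25, so stemming adds "cats"
not_expanded = search("cat", docs, threshold=2)  # 2 exact hits meet the threshold
```

Lowering the threshold in the second call shows the other branch: with enough exact results the query is left as the user wrote it.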
That's one of the things that stemming is important understand when stemming is activated because of course the results we stem all the fields that you said now of course if you searched only title then only the title.
Beast in that depends on the search. I only search the title stemming will occur in the studying but if I searched in any okay in all the fields then stemming when it's activator will stem all the fields and I will get a my query if it's if we go to our fronting that we have here for our use if I search cat dogs I in fact if I searched cat dogs a is whoops sorry. Sigh search. Can't a dog in and read. I have this a wide life. Ok sorry listen uh. I the exam so on but this is the example where you can see the cats dogs and raised well so I'll get to this around so it's not my example. I one that I went to this result this one. I found one that helps us. Is this okay what i have here i have here the title as cats but if i go into the details i can see that what is a cat okay in fact in itself of course that's exactly what is the cat baby okay. The description was also stemmed over here. Baby two babies that's the fertilization. Okay cat two cats here in fact is what's the cat but here's the baby that i was looking for. It happens on everything. I have here anywhere in the record. And when you search anywhere in the record that means that the polarization same will happen in everything and this is important. And why am i explaining this importantly because everywhere any field that you are now searching if it's also local fields Ellis Ellis ours and things like that and you have local field. There will also be stemmed in your search and you're searching on them and if you have certain things inside there and the stemming will also step inside there and you also get results based on that and only not only on the main title also subject descriptions that are normally you can say their user search everything this all the fields will be will be stamped and what happens a lot of times in that musics many results return for neck because everything is stem.
This leads me to the next one, which is inflection. Inflection is the name that we give to something that is in fact very similar to stemming and pluralization, and it is based on stemming and pluralization. In what we call inflections we do a bit more than the stemming and pluralization mechanism does: there are certain stemmings and pluralizations that are not done there and are done in inflection, which is why it has a different name. But the main reason it has its own name is that it helps the ranking.
Why is this important? Inflection is done only on the title field. It is not like stemming, which is done on everything. It is also not based on the number of results: it always occurs, as long as there is more than one term. With one term, inflections are not activated. Stemming is activated based on the number of results that you have, but inflections only happen with more than one term. And inflection has an important property that stemming does not have: it enforces that all the query terms are in the title.
This is quite important. My example, now that I look at it, is not the greatest example, because here "what is a cat" and "babies" are both in the description. What can happen is that I have "cat" in the title and "baby" only in the description. Stemming is done per word, on all the fields and all the words, so you can find "cats" stemmed in the title and "babies" in the description, in completely different fields: "baby" in one place, "cat" in another place, stemmed and mixed around. As I said before, stemming is important to help the recall. If you do not have enough results, you do want to get more results; it could be that we missed something, and stemming and pluralization will increase the results. That is why it is based on the number of results. But it is on all the fields: it is a keyword expansion, everywhere, on every field.
When we talk about inflection, I am going to inflect only the words in the title. Let's go back to what I had here before, "cat, dog and related wildlife". Inflection works only on the title, so I can expand, I can stem, I can pluralize, I can do many interesting things with "cat, dog and related wildlife". The "and" is not used; it is a stop word that we remove when we rank in this specific area. What we do is add "cats" and "dogs" here, and we search them only in the title. With the inflection mechanism we do not get other results based on "cats" in the description, or "cats" in a local field, or "cats" by mistake in some other field. Inflections help our ranking because they enforce that all the words, with all the stemming and pluralization that was done, appear only in the title. This way only the more relevant results for what you searched, with their inflections, are ranked higher.
Let me remove this for a moment and show you "sound of music". I search "the sound of music" and here are my results: "The Sound of Music", "The Sound of Music". These are very nice results. The stop words, like we said before, count for less; that is why even without the "the" this can be returned. But if I search "the sounding of music", I do not really have "the sounding of music", as you can see. I do have "sounding" and "music" in the fifth result, in the title, and that is fine, but it is further off, as you can see: it is not the phrase itself, and it is a bit less than what the user wrote. Over here what I have is "The Sound of Music", which is very similar to what I wrote. I do not have anything written as "the sounding of music" or similar in my corpus; that is why I did not get that result.
So what did the inflection do? As I said, it works only on the title. It removed the "ing" and said: I now want to look for "sound of music", but only in the title, and as a phrase. What you got is that the relevant results are higher up: all these "The Sound of Music" results are higher than the one with "sounding" and "music", because they are closer, and they help the user get closer to what he wanted even though he did not write the correct form. And because the inflection was enforced only on the title, I did not get many other results, as I would with stemming. It helped the ranking itself, and it is a very nice and very strong thing that we have here.
To summarize those two questions: as we said, synonyms are similar to stemming, but they are always done. They rank near the user's query itself, but not exactly: they will always be a bit lower than the user's query. You can set a synonym very high, or at whatever level you want, to bring it close and to help your recall. Stemming and pluralization are based on the number of results, but act on all the fields, which helps recall. Inflection does the same thing as stemming and pluralization, but only on the title: all the words that were stemmed, and all the words that the user queried, have to be in the title. I just want to add one last thing: inflection needs more than one word. It does not occur on a single word, because we do not want to start getting many results, as with stemming, even on the title, on every search that we do.
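The inflection rules described above (title only, every query term required, more than one term, stop words ignored) can be sketched roughly as follows. This is a simplified model under assumptions: `toy_stem`, the stop word list and the function names are invented for illustration, and the phrase-level ranking mentioned in the talk is not modelled.

```python
STOP_WORDS = {"the", "of", "and", "a"}   # hypothetical stop word list


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a trailing 'ing' or 's'."""
    w = word.lower()
    for suffix in ("ing", "s"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def inflection_match(query, title):
    """True only if the query has 2+ non-stop terms and ALL of them are in the title."""
    terms = [toy_stem(t) for t in query.split() if t.lower() not in STOP_WORDS]
    if len(terms) < 2:
        return False                      # single-term queries: no inflection
    title_stems = {toy_stem(w) for w in title.split()}
    return all(t in title_stems for t in terms)


sounding = inflection_match("the sounding of music", "The Sound of Music")
single = inflection_match("cats", "Cats")
split_fields = inflection_match("cat baby", "Cats")   # "baby" is not in the title
```

Unlike the all-fields stemming expansion, a record only qualifies here when every term lands in the title, which is why this mechanism favours precision in the ranking over recall.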
What are the key parameters of relevance in Primo Central? This can be a very complicated question. There are a lot of parameters, so I will cover the key parameters, as written here, and not everything; many small parts come into it. There are two main factors, and I will go through them one after the other. There is the document boost, which is the boost given to a document that is in Primo Central, and there are the query parameters: what is done with the query.
The document boost, as stated, is the boost that I give to a certain document that exists in the Primo Central corpus. The idea also exists in Primo local, not with these parameters of course, but you can see it in the search engine configurations, where you can give a boost to documents based on things like the date or FRBR. There are different kinds of parameters that you can give a document, so that you can say which document is more important than another.
What do we mean here? Take the well-known example: I have two documents about global warming, but one is peer-reviewed, which is the first parameter of the document boost, and the other is not. Simply stated, the peer-reviewed document is more important than the non-peer-reviewed one. That is the idea. Of course, if I search for the document that is not peer-reviewed, by its title "global warming" and its specific author, that document will come up even though it is not peer-reviewed and has a lower document boost, because that is the record I searched for. But when I search "global warming" and many documents are returned from Primo Central, we have to say which documents are more important than the others and will be boosted higher.
Here are the key parameters that make up an important document. A document that is peer-reviewed is considered more important. Then citations: citations have a certain impact, but not as strong, because newer things have fewer citations, so it is an important factor but not the strongest. Then journal usage: the importance of the journal, how much the journal is used. This is aggregated and used to give journals that are used more a bigger boost. And of course the date: the newer the document is, the more boost it gets. The date boost is in fact decayed: as you go back in time the boost is less, but the document still gets a boost.
So these are the main factors: peer review, citations, journal usage and date. They come together into a certain boost number that is given to a document. You can have two documents that have all four factors, but if one is newer it will probably be a bit stronger, and if its journal usage is higher it will be stronger even though its date is a bit older. There is a certain amount of leverage between them.
That is the document side. The other side is the search itself. When we search in Primo Central there are many parameters in the query that help us bring the relevant documents up. First, field boosting: we boost the title, the author, the subject and the description, and they have different strengths. The title is the strongest, after that the subject, then the author, and then the description. So even though you have a match in the subject, a match in the title will be stronger. Second, the proximity of the user's query terms in the document itself: the closer they are, the stronger the match. The further apart they are, in the title, in the subject or in the description, and the bigger the field is, the less important the match. Those two things are very important for the query.
To help us create these queries we have entity recognition. We try to understand when the user is searching for an author. Say he searched "JK Rowling Harry Potter", which is not really typical of Primo Central, but it is a nice example that I like. We understand that JK Rowling is an author and search it in the author field, with the understanding that if you have JK Rowling in your query you want records whose author is JK Rowling and not records with JK Rowling in the title. You also want Harry Potter in the title, and because we understand that, we rank it higher. We also recognize citations, and dates if you put them in.
The last thing is expansions, as we talked about before: synonyms, spelling, stemming, pluralization and inflections. All these are part of relevance in Primo Central and in Primo, to help us bring more results. What the user wrote is the most important, but the expansions help us get more results even if they are not exactly what the user typed.
All these things come together, each with its own strength: the query parameters and the document boost together give us the key parameters for what we return from Primo Central. To summarize everything we spoke about: what the user wrote is the most important thing, because we know that most users write what they want to look for. Everything else around it, did you mean (which we did not talk about), the expansions, the spelling, is there to help the user find what he wants. But we hope that the user knows what he is searching for, and that is why we give him more credit than the other things that we do. That's it. I hope it was understood; we covered a lot of things here.
Good luck to you all, and good day. Bye bye.
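As a way to picture how the two sides described for Primo Central might interact, here is a small Python sketch that multiplies a field-weighted query score by a document boost. Everything numeric is an assumption: the weights, the decay constant, the `current_year` default and the choice to multiply the two parts are invented for illustration. Only the ordering of the fields (title, then subject, then author, then description) and the four document factors come from the talk; proximity, entity recognition and expansions are not modelled.

```python
import math

# Hypothetical field weights; only their order is taken from the talk.
FIELD_WEIGHTS = {"title": 4.0, "subject": 3.0, "author": 2.0, "description": 1.0}


def document_boost(doc, current_year=2024):
    """Combine peer review, citations, journal usage and a decayed date boost."""
    boost = 1.0
    if doc.get("peer_reviewed"):
        boost += 0.5                                       # assumed weight
    boost += 0.1 * math.log1p(doc.get("citations", 0))     # dampened citation impact
    boost += 0.2 * doc.get("journal_usage", 0.0)           # assumed to be in 0..1
    age = max(0, current_year - doc["year"])
    boost += 0.3 * math.exp(-age / 10.0)                   # older: smaller, never zero
    return boost


def query_score(query, doc):
    """Sum field weights for every query term found in each field."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = doc.get(field, "").lower().split()
        score += weight * sum(1 for t in terms if t in words)
    return score


def relevance(query, doc, current_year=2024):
    # Assumption: the two sides are combined by multiplication.
    return query_score(query, doc) * document_boost(doc, current_year)


reviewed = {"title": "Global warming", "peer_reviewed": True, "citations": 10,
            "journal_usage": 0.5, "year": 2015}
not_reviewed = dict(reviewed, peer_reviewed=False)
older = dict(reviewed, year=1995)
in_description = dict(reviewed, title="Climate", description="global warming")
```

With these sample records, the peer-reviewed and the newer documents score higher than their counterparts for the query "global warming", and a title match outweighs a description match.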
So I'll quickly go to the back office, here in a different environment. We go to the normalization rules, and here we have our ranking section. In this ranking section I can add the booster: booster 1 and booster 2. In booster 1 I can set a rule in the normalization rules so that certain records, those that come from a certain data source or whatever rule identifies the collection, are the ones I want to boost, and I set a constant value for the boost that I want to give them.
A value of 1 means that no boost is really done. It is a multiplier: it multiplies the score of those records by 1, and nothing happens. We generally recommend not putting in very large numbers, but mostly 2, 3 or 4, single-digit numbers, to multiply the score by two or three. There is no real recommendation for what the number should be, because it really depends on the data in the collection, in the corpus. So the best thing to do is to test it in your staging or testing environment. Take the record or records that you want to change with this new rule; I would suggest starting with 2 and going a little higher step by step, running the normalization rules, indexing, and seeing how it affects that specific collection, or whatever source it is that you would like to boost.
That is one of the simple ways, when you have a certain collection that you know you want to boost. That was the first one.
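To make the multiplier behaviour concrete, here is a minimal Python sketch of a booster applied to records that match a rule. The record layout, the `source` values and the function name are hypothetical; the sketch only models the idea that the booster multiplies the relevance score, with 1 meaning no change.

```python
def rank_with_booster(records, matches_rule, boost):
    """Multiply the score of records matching the rule by `boost`, then rank by score."""
    def boosted(rec):
        return rec["score"] * (boost if matches_rule(rec) else 1.0)
    return [rec["id"] for rec in sorted(records, key=boosted, reverse=True)]


# Hypothetical records: B comes from the repository source that has full text.
records = [
    {"id": "A", "source": "catalog", "score": 0.9},
    {"id": "B", "source": "repository_fulltext", "score": 0.5},
]


def has_full_text(rec):
    return rec["source"] == "repository_fulltext"


no_boost = rank_with_booster(records, has_full_text, boost=1.0)
boosted_x2 = rank_with_booster(records, has_full_text, boost=2.0)
```

With a boost of 1 the order is unchanged; with 2 the full-text record overtakes the other one. Whether 2 is enough in practice depends on the real score distribution, which is why testing in a staging environment is suggested.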
The second one is blending. Here we are not talking about records inside your own institution's collection. It is when you start to blend with other collections, with third nodes, third-party deep searches: WorldCat, EBSCO, Primo Central. You want to blend, but you want to make sure that certain engines are boosted higher, are more relevant, or that your collection will still be stronger than whatever you are blending it with. We would like to blend and get results from everything, but we want to make sure that our institution is stronger.
So I go back to the back office, start from the beginning again, and go to the search engine configurations. Here we have our blending, our force blending parameters. Here I can decide for which engine, which deep search, I would like to force blending. What does force blending mean? I am now getting two types of records from two different engines, and I want to decide which engine is more important or less important. Normally our recommendation is to start from the local engine. Let's say I am blending the local search engine with Primo Central, and I want my local results to be stronger, more relevant than Primo Central, than WorldCat, than EBSCO. I want them to be seen higher up in the results.
First of all, I can give a constant factor. This means that I give a constant boost to all the results that come from my local search engine, automatically boosting them. Take the Primo Central score and the Primo local score: I multiply my Primo local results by that factor automatically, and then we see who the winner is between the results, which is more relevant.
If the constant factor is still not enough, and I want to make sure that in certain searches, where I know that Primo Central or WorldCat are stronger than my local collection because of the data there, I still have my local results higher up, I can use force blending. Force blending has three parameters.
Sorry, I mixed two of them up at first, so let me state them correctly. The location for combining, top, center or bottom, is where in the first 10 results I put the local results: top is around the second place, center is around the fifth, and bottom is around the seventh. The minimum hit rank for combining, high, medium or low, says how high the local score has to be in order to be combined into that location.
What do I mean? Say the Primo Central scores are around 10, because those results are very relevant, and the Primo local score is around 2, because it is not that relevant. Do I still want to combine it at the top location, to put it in the second place, even though its score is only 2? With the hit rank for combining set to low, I say that the threshold to be allowed in is low: even though the score is 2, I still want my local collection to come up. Medium says no: I do not want very low ones to come up. If it is not a very relevant result, even though it is in my local collection, I will not put it in the second place unless it is around medium. Say the top score is 10; I would like the local score to be at least 5. These numbers are just for illustration. If it is in the medium range relative to the highest record that I have, then I will put it in the top place. And high means that only very relevant, very high scores are boosted up. If it is not a high score, I will not combine the local results with Primo Central. You can play around with this parameter to see how much you want the local results to be boosted up anyhow.
The last one is the number of results to reward: the number of local results that I will boost up to that location. Take our example: minimum hit rank low, location top, and number of results 3. That means that even if all my Primo local results have a score of 2 and Primo Central has 10, I will still take three local records, push them up above the Primo Central results starting from the top location, and merge them with the others from Primo Central. The 3 is how many I will boost up if they are not strong enough. If the Primo local results are strong enough, really relevant, just like Primo Central, then you will still see the local result in its place, and the three below it will be boosted up and blended in. This is just to reward the ones that are not strong enough, that do not win on score against Primo Central or the other third node, when we want to boost them up.
So this is our second way. It is not only for local; the main question was about local, but we can do it with any engine. We created a few scopes that blend WorldCat, EBSCO, Primo Central and local, and whatever the combination, you can use this to decide which of those engines are stronger and which are weaker. You can play around with it.
Before I continue to the next one, which is the institution boost, I want to make an important point about blending. There is a parameter that I have seen is not used by everyone, and it is quite an important parameter.
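The interplay of the constant factor and the three force blending parameters described above can be modelled roughly as follows. This is only a sketch under assumptions: the threshold fractions, the exact slot positions and the function and variable names are invented for illustration and are not Primo's actual values or implementation. The example numbers (remote scores around 10, local scores around 2) follow the illustration in the talk.

```python
# Hypothetical values: thresholds are fractions of the top score,
# locations are 0-based positions ("around second, fifth, seventh").
THRESHOLDS = {"low": 0.1, "medium": 0.5, "high": 0.8}
LOCATIONS = {"top": 1, "center": 4, "bottom": 6}


def blend(local, remote, constant_factor=1.0,
          min_hit_rank="low", location="top", reward=3):
    """Merge two (id, score) lists by score, then promote weak local results."""
    merged = [("local", rid, s * constant_factor) for rid, s in local]
    merged += [("remote", rid, s) for rid, s in remote]
    merged.sort(key=lambda r: r[2], reverse=True)
    if not merged:
        return []
    slot = LOCATIONS[location]
    cutoff = THRESHOLDS[min_hit_rank] * merged[0][2]
    # Local results ranked below the slot that still pass the minimum hit rank.
    promoted = [r for r in merged[slot:]
                if r[0] == "local" and r[2] >= cutoff][:reward]
    rest = [r for r in merged if r not in promoted]
    return [r[1] for r in rest[:slot] + promoted + rest[slot:]]


local = [("L1", 2.0), ("L2", 1.9), ("L3", 1.8), ("L4", 1.7)]
remote = [("R%d" % i, 10.0 - 0.5 * i) for i in range(1, 9)]

low_top = blend(local, remote, min_hit_rank="low", location="top", reward=3)
medium_top = blend(local, remote, min_hit_rank="medium", location="top", reward=3)
factor_only = blend(local, remote, constant_factor=6.0, min_hit_rank="high")
```

In this model, low/top/3 lifts three weak local records to just under the first result, medium leaves them where their score puts them, and a large enough constant factor alone moves local results up without any forced placement.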
I will go to the mapping tables and search for deep search. In the deep search plugin parameters there is a plugin parameter called Primo rank. Sometimes it is false and sometimes it is true; it depends. Let's take WorldCat, where it is sometimes set to false. What happens if it is false, and what happens if it is true?
If it is false, we do not rank the WorldCat or EBSCO deep search results that are returned. We do not give them a new rank, and we do not blend them with our local rank. Say we are blending the local search engine with WorldCat. If we have not turned on the ranking for WorldCat, we do not give those results a rank in Primo. We run a search on local and get results, and we run a search on WorldCat and get those results. Because we have not re-ranked, because we have not given a rank to WorldCat based on our local search engine, what you will normally see is that the results are merged page by page: a page of Primo local, the next page WorldCat, the next page Primo local, the next page WorldCat.
There is no relevance ranking score to compare between them, because WorldCat can return a score that is on its own scale, which could be in the hundreds or thousands. Say their score scale is between 10 and 1000, and Primo's is between 0 and 1: there is nothing to compare between them, so the blending does not work as well. That is why you will see it page by page.
But if you turn it on, if you set it to true, you will normally see that the results are better blended. What it does is give a rank to the WorldCat results, the deep search results. It runs the search in WorldCat, gets the results, and then, in technical terms, creates a mini index. In effect it ranks them on the same relevance scale as Primo local, and then when you blend them the scores are similar.
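The difference between the two settings can be pictured with a small Python sketch. It is a model of the described behaviour only: the `rescore` callback stands in for whatever re-ranking Primo does internally (which is not modelled), and the page size, score values and names are all hypothetical.

```python
from itertools import zip_longest


def pages(results, size):
    """Split a result list into consecutive pages of `size` items."""
    return [results[i:i + size] for i in range(0, len(results), size)]


def blend_results(local, remote, primo_rank, rescore=None, page_size=2):
    """Blend (id, score) lists; without a common scale, fall back to page by page."""
    if primo_rank:
        # Remote results get a new score on the local scale, then merge by score.
        rescored = [(rid, rescore(rid)) for rid, _ in remote]
        merged = sorted(local + rescored, key=lambda r: r[1], reverse=True)
    else:
        # Scores are not comparable, so alternate pages from each engine.
        merged = []
        for lp, rp in zip_longest(pages(local, page_size),
                                  pages(remote, page_size), fillvalue=[]):
            merged.extend(lp)
            merged.extend(rp)
    return [rid for rid, _ in merged]


local = [("L1", 0.9), ("L2", 0.4), ("L3", 0.2), ("L4", 0.1)]          # scale 0..1
remote = [("W1", 950.0), ("W2", 700.0), ("W3", 300.0), ("W4", 20.0)]  # own scale
on_local_scale = {"W1": 0.8, "W2": 0.5, "W3": 0.3, "W4": 0.05}        # assumed re-rank

page_by_page = blend_results(local, remote, primo_rank=False)
by_score = blend_results(local, remote, primo_rank=True,
                         rescore=lambda rid: on_local_scale[rid])
```

With the flag off the two lists alternate in blocks; with it on, results from both engines interleave according to a single comparable score.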
The scores are on the same scale, they are easier to compare, and you will see that the blending is much better. Of course, if you want it the way it looks page by page, that is fine, but I do suggest that for blending it is better to turn the Primo rank parameter to true.
The last one, if we want to boost, is the institution boost. It is mainly used in consortia, or when you have several institutions. You can boost records based on the institution's collection. How does this work? Two things need to be done. First, records need to have the institution code in the delivery section: there has to be an institution field with the institution code. The delivery section of the PNX has to contain the institution, because the system uses that institution code to understand which records are connected to the institution. The second thing is to turn it on.
Where do we turn this on? We will look in the views. Here I am in a consortium with White Shore and Volcano. Let me do it slowly. I go to the Volcano view and click Edit. I go past the scope list and the tabs to the tiles. In the tiles I go to the brief display, and there, sorry, not locations, I go to brief results. In brief results we go down below, and what we have for the view is "Boost results from my institution". We check this parameter and then we deploy.
What will happen then? When you search in this consortium, which has many institutions in the view, and, as we said, your PNX and the other institutions' PNX, the whole consortium, have an institution field with the institution code in the delivery section, it will boost the institution of the scope or the view. This is the Volcano view, so it boosts Volcano and, in a way, places the others lower down, so that Volcano is more visible and ranked higher. If I go to the White Shore view and it is turned on there, White Shore will be stronger, ranked higher than Volcano. If it is not turned on, then the regular ranking applies to all the records, and they are blended between all the institutions.
That is the third way, in a consortium, to boost your local records. Of course you can also use the booster that we saw before, but then you have to put every institution into the booster and you are all blending on the same thing. So when you are in a consortium it is better to use the institution boost, to allow your records to be higher up in the results.
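As a sketch of the mechanism just described, the following Python model boosts records whose delivery section carries the institution code of the current view. The record layout, the institution codes and the boost factor are assumptions for illustration; the talk does not state how large the boost is or how it is applied internally.

```python
def rank_for_view(records, view_institution, boost_my_institution, boost=2.0):
    """Rank records, optionally boosting those that belong to the view's institution."""
    def score(rec):
        mine = view_institution in rec["delivery"]["institution"]
        factor = boost if (boost_my_institution and mine) else 1.0
        return rec["score"] * factor
    return [rec["id"] for rec in sorted(records, key=score, reverse=True)]


# Hypothetical consortium records with the institution code in the delivery section.
records = [
    {"id": "W1", "score": 0.6, "delivery": {"institution": ["WHITESHORE"]}},
    {"id": "V1", "score": 0.5, "delivery": {"institution": ["VOLCANO"]}},
]

volcano_on = rank_for_view(records, "VOLCANO", boost_my_institution=True)
whiteshore_on = rank_for_view(records, "WHITESHORE", boost_my_institution=True)
volcano_off = rank_for_view(records, "VOLCANO", boost_my_institution=False)
```

The same records rank differently per view, which is what makes this more convenient in a consortium than a single shared booster rule.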
OK, so to sum up, the three ways that we talked about to allow certain collections, certain records, certain institutions or certain engines to be ranked higher are: the booster in the normalization rules, which we can set with a rule for the collections or records that we know we want to boost; blending between the engines themselves; and the institution boost inside a consortium, which is mainly between the institutions themselves.
We will go to the next one. "If one adds or enhances terms in the synonym thesaurus, does this have any influence on the relevance ranking? For example, if the term searched for is allergy, and the thesaurus has four alternatives for it, are these four alternatives treated as equally relevant as the term allergy?" Allergy, of course, being the original term.
I will talk about synonyms more fully in a moment, but let's answer the question first. Today, the system does not treat the synonym as equal to the original term. We see the original term as the more important term. In fact, in all the recall and ranking that is done in Primo, the original term is always stronger than anything else that is added: stronger than a synonym, stronger than stemming, as we will see later on. It is always stronger because we see what the user entered as more important. Now, with synonyms, if you set one to very high, it will be close: not equal, but very close in its ranking.
Let me show what I mean by setting it to very high. We go back to the beginning, to the search engine configurations, and here we have the synonyms. This is the score itself: here is the score that we give to each level. One second, I want to show you the scores.
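The answer above, that a synonym can come close to the original term but never equal it, can be illustrated with a short Python sketch. The level names other than "very high", the weights, the cap and the sample synonyms for "allergy" are all hypothetical; they are not values from Primo or from the question.

```python
# Hypothetical level weights, each strictly below the original term's weight of 1.0.
LEVEL_WEIGHTS = {"very_high": 0.95, "high": 0.8, "medium": 0.5, "low": 0.2}
MAX_SYNONYM_WEIGHT = 0.99   # assumed cap: a synonym never equals the original

# Hypothetical synonym entries for illustration only.
SYNONYMS = {"allergy": [("hypersensitivity", "very_high"), ("intolerance", "low")]}


def expand(term):
    """Return the original term at full weight plus its weighted synonyms."""
    expanded = [(term, 1.0)]
    for synonym, level in SYNONYMS.get(term, []):
        expanded.append((synonym, min(LEVEL_WEIGHTS[level], MAX_SYNONYM_WEIGHT)))
    return expanded


def score_title(term, title):
    """Best weight among the original term and synonyms found in the title."""
    words = title.lower().split()
    return max((w for t, w in expand(term) if t in words), default=0.0)


original = score_title("allergy", "Allergy handbook")
close = score_title("allergy", "Hypersensitivity handbook")
distant = score_title("allergy", "Intolerance handbook")
```

Raising a synonym to the highest level narrows the gap to the original term without closing it, which matches the advice to use very high when the alternatives should be nearly equal.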
Where is it the digitally to do i lost my page. One okay oh. I have my notes back that's what I want. Sorry moment closed and i will want to open my English. Let's open the English saloons our success I can show you hold it from here okay. One second here we go so i will go to where we have where we have our cinema files and our cinnamon files. Let's take that'll buff english. The ones that aren't much nice to see okay so we have here yogurt to yogurt and it's very high okay and it's low and what we have here is as you can see we can set which ones we want here very high low medium like we saw in the search and configurations. You can say what the boost over there is in itself normal. Okay now of course you can set the parameters of what you would like it to be how much it'll give it and of course you should have whatever you have here very high should be higher in the configurations in the back office but once you put very high and you put it as close as possible as strong as you that is that you will get closer but of course like. I said the system doesn't allow the cinnamon to be exactly like the original terms. Seen again as we do not see it as the as the original term. It'll always be closer even whatever you put your just so put it as high as opposed to put at the very high if you want the things to be nearly equal okay. If it's something that it's likely so over here that we want that in our question over here they want these four things to be very ear very close it is so put them is very high and give them a relatively high score a closer as 21 and that will make it as closer as possible as we can it to the original term. A the reason that they're just as a quicker than the reasons we have here. These different types of a high-low loan world is of course there are many different kind of things that are similar. Ok that the that come like over here there that that can be a cinnamon and some are closer related.
Some are less related. When you have a lower-level one, it will still be added, but of course with a lower boost. If you don't have anything that matches the word itself, something will still come up that is connected to it: lower in relevance, but still connected. So you can have a certain hierarchy of how closely your synonym is related to the original word. The ones that are nearly the same we should set to very high.

Then there are the ones that are close. As we see here, we have the Roman numerals, which are similar. Or let's say the number 11: we have here "11th" and "eleventh", and it is very high, because if you write the number you would also like the word to be found, or the opposite. You can say it is the same thing, but it's not exactly, and I want to show you an example of why I do not see it as exactly the same.

This is the example: I look for "16th century". I do have a synonym for "16th century". Now, Sixteenth Century, this is the journal that maybe I wanted to look for. If I write "16th century" here, maybe I want this journal to come up, but I am not sure, and that is what I want to explain. We want to make sure that what the user wrote is treated as what he wants. So the "16th century" that he wrote is what comes up here, and not the record we saw before, which you only reach as you go lower down.
We have things that contain "16th century" and the like, but we treat the user's terms as more important than the synonym, because they could be exactly what he wanted. If I start giving him the synonym matches as equal to it, that is not what he wrote. Even so, I will recall the synonym match, and if we go further we will find that journal itself. If I write "journal" here, as you can see, it came up: Sixteenth Century Journal. That's the result. I don't have a "16th century journal" with the number 16 in it, but this one I do have, so here the synonym helped me return the result that I wanted even though I didn't write it that way. We can see that the synonym came into play very nicely here.

But if the user just wrote "16th century", the literal form could be what he wanted, and in that case other things come up before the Sixteenth Century Journal that I had before. So the user's terms are very important to us. We don't want to lose the user's terms; we want to use the user's term and also add other ones, not as equals but as a bit less.

I will also explain a bit about how the synonyms work. Synonyms are added only in the keyword search and are not used when boosting phrases. What do I mean? Synonyms are per word. That's how Primo uses them. It means that when there are algorithms that create phrases, ranking algorithms that give a different boost to the whole phrase, the synonym doesn't take part in that ranking. The ranking it gets works mainly on the keyword itself. In other words, in terms of recall: when I write "16th century", the synonym of "16th century" gives me the recall. It is in my 1,236 results; I do have the "sixteenth century" record in my results here.
In ranking, of course, "16th century" takes precedence, because, like I said before, it is what the user wrote. Now the journal, as we saw, was the first result when I added "journal". Remember, it was at the top there because of the ranking itself, because it was the only result that matched, and it is still in the recall. It is still in the results that I have here.

It's very important to understand how the synonyms work, based on what you write. They are here to help the recall, to help you not lose things that exist in your corpus. In that way they help our algorithm, but they are not used in all the ranking algorithms. In a moment I will go to where we do use similar terms in the algorithm, and that will in fact lead me to the next question. But just to sum it up, as we saw here: synonyms are nearly equal but never equal; the user's term always comes first.

So I'll go into the next one, which in fact stems from that question: stemming, synonyms, inflections, pluralization. What are they, and when are they done?

There were many cases in the last few years where people know that there is stemming in the system; there was a feature from the beginning of last year, the inflections on the title; there is pluralization; there are the synonyms. When are these used, what do they do, and what do they mean? I will break them into three things.

Synonyms are based on the synonym file, and they are single terms. They are always added: when you search, if you have a synonym and it is in your file, it will be added to the search. Like we said, it mainly helps the recall, so that you won't lose results even though of course you only wrote one of the terms.
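The synonym behaviour described so far can be sketched in a few lines of Python. This is only an illustration of the idea, not Primo's code: the boost values, the synonym entries and all function names are assumptions made for the example. It shows the three points from the talk: synonyms are added per word, their boost always stays below the original term, and the phrase boost looks only at what the user actually typed.

```python
# Hypothetical level-to-boost table. The real values are configured in the
# Back Office and are not given in the talk.
ORIGINAL_BOOST = 1.0
LEVEL_BOOST = {"very_high": 0.9, "high": 0.7, "medium": 0.5, "low": 0.3}

# Hypothetical synonym file entries: term -> [(synonym, level), ...]
SYNONYMS = {
    "16th": [("sixteenth", "very_high")],
    "yogurt": [("yoghurt", "very_high")],
}


def expand_query(query):
    """Return one list of (term, boost) alternatives per query word."""
    expanded = []
    for word in query.lower().split():
        alternatives = [(word, ORIGINAL_BOOST)]
        for synonym, level in SYNONYMS.get(word, []):
            # Cap below the original so a synonym can never tie with it.
            boost = min(LEVEL_BOOST[level], ORIGINAL_BOOST - 0.01)
            alternatives.append((synonym, boost))
        expanded.append(alternatives)
    return expanded


def keyword_score(title, expanded):
    """Sum the best matching boost per query word; 0.0 if a word is missing."""
    words = title.lower().split()
    total = 0.0
    for alternatives in expanded:
        best = max((b for t, b in alternatives if t in words), default=0.0)
        if best == 0.0:
            return 0.0
        total += best
    return total


def phrase_bonus(title, query):
    """Phrase boost uses only the user's own words, never the synonyms."""
    return 1.0 if query.lower() in title.lower() else 0.0
```

With these assumed values, a title containing the literal "16th century" scores higher for the query "16th century" than "The Sixteenth Century Journal", but the journal is still recalled because the synonym matched at keyword level.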
That's what we spoke about before. Next come stemming, pluralization and inflections.

Stemming and pluralization together are what we call stemming; pluralization is inside stemming. Just briefly, what stemming means is that I take the word and bring it to its stem. "Playing" becomes "play", "cats" becomes "cat". Not all of these are strictly stemming; the "cats" case belongs to pluralization, which I'll come to. With stemming I take the word to its stem form: I change "playing" to "play" or "plays" to "play". I bring it to the stem form so that if I searched with a plural, or with "happily", it is also stemmed to "happy" and I find results based on that. Pluralization, of course, is the opposite direction: I take "cat" and turn it into "cats". I add a plural where it makes sense and pluralize the word.

Most people know this from here, the search engine configuration. We have here the results threshold, the maximum results for stemming. This is 25 results out of the box, which means that if my search returns fewer than 25 results, I will activate the stemming mechanism. The idea is that you don't have enough results. Your search yielded a very small number of results, and the system will now stem and pluralize your query to increase the number of results. Maybe the results that you wanted were in fact among those you didn't get from your search. That's what I mean by "based on the number of results". It helps the recall, and it affects all the fields.
That's one of the important things to understand about stemming: when stemming is activated because of the number of results, we stem all the fields that you searched. Of course, if you searched only the title, then only the title is stemmed. It depends on the search. If I search only the title, stemming will occur in the title, but if I searched in "any", in all the fields, then stemming, when it is activated, will stem across all the fields.

If we go to the front end that we have here and I search for cats and dogs... sorry, that is not the example I wanted. I found one result that helps us. What I have here is a title with "cats", but if I go into the details I can see "What is a cat". The description was also stemmed here: "baby" to "babies", that's the pluralization, and "cat" to "cats". Here is the "cat", and here is the "baby" that I was looking for. It happens on everything, anywhere in the record. When you search anywhere in the record, the pluralization and stemming happen in everything.

Why am I explaining this? Because it applies to any field that you are now searching. If you are also searching local fields, lsr fields and things like that, they will also be stemmed in your search, and you will also get results based on what is inside them. So it is not only the main title but also the subjects and descriptions. Normally users search everything, so all the fields will be stemmed, and what happens a lot of the time is that many results are returned because everything is stemmed.
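The threshold logic can be sketched as follows. This is a minimal illustration under stated assumptions: the toy `naive_stem` function, the record layout and the function names are invented for the example and are not how Primo stems. Only the idea comes from the talk: stemming is switched on when the plain search returns fewer results than the threshold (25 out of the box), and it then applies to every field that was searched.

```python
STEMMING_THRESHOLD = 25  # out-of-the-box value mentioned in the talk


def naive_stem(word):
    """Toy stemmer for illustration only; not Primo's algorithm."""
    for suffix in ("ies", "ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + ("y" if suffix == "ies" else "")
    return word


def matches(record, terms, fields, stem=False):
    """True if every query term occurs somewhere in the searched fields."""
    norm = naive_stem if stem else (lambda w: w)
    words = {norm(w) for f in fields for w in record.get(f, "").lower().split()}
    return all(norm(t) in words for t in terms)


def search(records, query, fields, threshold=STEMMING_THRESHOLD):
    """Plain keyword search, retried with stemming when results are scarce."""
    terms = query.lower().split()
    hits = [r for r in records if matches(r, terms, fields)]
    if len(hits) < threshold:
        # Too few results: stem the query and every searched field.
        hits = [r for r in records if matches(r, terms, fields, stem=True)]
    return hits
```

Note that in this sketch the terms may match in different fields (for example "cat" in the title and "baby" in the description), which is the mixing across fields that the talk describes for stemming.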
That is one of the things we wanted to help with, and this leads me to the next one, which is inflection.

Inflection is the name that we gave it, but in fact it is very similar to stemming and pluralization; it is based on stemming and pluralization. In what we call inflections we do a bit more than the stemming mechanism does: there is certain stemming and certain pluralization that are not done there and are done in inflection, which is why it has a different name. But the main reason it has its own name is that it helps the ranking.

Why is this important? Inflection is done only on the title field. It's not like stemming, which is done on everything, and it's not based on the number of results. It always occurs, or more precisely it always occurs when there is more than one term. With one term, inflections aren't activated. Stemming is activated based on the number of results that you have, but inflections only happen with more than one term. Inflection works only on the title field, and it has an important property that stemming doesn't have: it enforces that all the query terms are in the title.

This is quite important. My example from before is not the greatest one, because "what is a cat" and "babies" are both in the description. But what can happen is this: stemming is done per word, on all the fields and all the words. So you can find "cats" stemmed in the title and "baby" in the description, in completely different fields, all mixed around. Like I said before, stemming is important to help the recall. If you don't have enough results, you do want to get more results; it could be that we missed something, that something wasn't found, and stemming and pluralization help you and increase the results. That's why it is based on the number of results. But it is on all the fields; it is a keyword expansion on every field.

When we talk about inflection, I'm going to inflect only the words in the title. Let's go back to what I had before, the search with "cat", "dog" and "related wildlife". It works only on the title, so I can expand, I can stem, I can pluralize, I can do many interesting things with it. Of course the "and" is not used; it's a stop word that we remove when we rank in this specific area. What we do is add "cats" and "dogs" here and search them only in the title. With the inflection mechanism we won't get results based on "cats" in the description, "cats" in a local field, or "cats" by mistake in some other field. Inflections help our ranking because they enforce that all the words, with all the stemming and pluralization that was done, occur in the title. This way only the more relevant results for what you searched, with the inflections applied, are ranked higher.

Let me remove this for a moment and show you another one: "sound of music". I search "the sound of music", and here are my results: "The Sound of Music", "The Sound of Music". These are very nice results. Of course the stop words, like we said before, count for less.
That's why even without the "the" this can be returned. But if I search "the sounding of music", I don't really have "the sounding of music", as you can see. I do have "sounding" and "music" in the fifth result; "sounding music" is in my title. It's okay, but it's further off. As you can see, it's not the phrase itself, so it's a bit less than what the user wrote. But what I have up here is "The Sound of Music", which is very similar to what I wrote. I don't have anything else in my corpus that is written "the sounding of music" or similar to that, which is why I didn't get such a result.

What the inflections did, and like I said it is only on the title, is remove the "ing" and say: I now want to look for "sound of music", but only in the title, and see it as a phrase. What you got is that the relevant results were higher up. All the "The Sound of Music" results are higher than the "sounding ... music" one, because they are closer, and they help the user get closer to what he wanted even though he didn't write the correct form, the stemmed or the plural one. And because the inflection was enforced only on the title, I didn't get many other results the way I would with stemming; here it helped the ranking itself. It's a very nice and very strong thing that we have.

So, to summarize those two questions, the synonyms and the stemming and inflections.
Like we said, the synonyms are similar to stemming but are always applied. They are near the user's query but not exactly equal: they will always be a bit less than the user's query. Of course, you can set them to very high, or whatever level you want, to bring them close and to help your recall with that synonym.

Stemming and pluralization are activated based on the number of results, but they apply to all the fields, which helps the recall.

Inflections do the same thing as stemming and pluralization, but only on the title, and all the words that the user queried, including the ones that were stemmed, have to be in the title.

One last thing I want to add: inflection needs more than one word. It does not occur on a single word. Of course, we don't want to start getting many results like with stemming, even on the title, on every search that we do.
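The contrast with stemming can be sketched like this. Again this is only an illustration: the stop-word list, the toy stemmer and the function names are assumptions, and whether stop words count towards "more than one term" is a guess made for the example. The three rules taken from the talk are that inflection looks only at the title, requires every query term to be there, and is skipped for single-term queries.

```python
STOP_WORDS = {"the", "of", "and", "a"}  # hypothetical stop-word list


def naive_stem(word):
    """Toy stemmer for illustration only; not Primo's algorithm."""
    for suffix in ("ies", "ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + ("y" if suffix == "ies" else "")
    return word


def inflection_match(title, query):
    """True if every non-stop query term has an inflected match in the title."""
    terms = [naive_stem(w) for w in query.lower().split() if w not in STOP_WORDS]
    if len(terms) < 2:
        return False  # inflections are not applied to single-term queries
    title_stems = {naive_stem(w) for w in title.lower().split()}
    return all(t in title_stems for t in terms)


def rank(records, query):
    """Stable sort that puts title-level inflection matches first."""
    return sorted(records, key=lambda r: not inflection_match(r["title"], query))
```

For the query "the sounding of music", a record titled "The Sound of Music" satisfies the match, while a record with "sounding" in the title and "music" only in the description does not, so the first is ranked above the second.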
What are the key parameters of relevance in Primo Central?

This can be a very complicated question. There are a lot of parameters, so I will do the key parameters, as it says in the question, and not everything. Many small parts come into it, but I'll do the key ones here. There are two main factors; like I said, there are many, but these are the main ones. There is the document boost, which is the boost given to a document that is in Primo Central, and then we'll go to the query parameters, what is done with the query.

The document boost is, as stated, the boost that I give to a certain document that exists in the Primo Central corpus. This also exists in Primo local. Not with these parameters, of course, but the idea of boosting documents exists in Primo local; you can see it in the search engine configuration, where you can boost documents by date, by FRBR, and so on. There are different kinds of parameters that you can give a document so that you can say which document is more important than another. What do we mean by that? Let's take two documents, both of them about, let's say, global warming, the well-known example.
I have two documents on global warming, but one document is peer-reviewed, which is the first parameter of the document boost, and one of them is not. Simply stated, the peer-reviewed document is more important than the non-peer-reviewed document. That's the idea here. Now of course, if I search for a document that is not peer-reviewed, by its title "global warming" and its author, that document will come up even though it is not peer-reviewed and has a lower document boost, because that is what I wanted. That was the record I searched for. But when I search "global warming" and many documents are returned from Primo Central, you now have to say which documents are more important than the others and will be boosted higher. Here are the key parameters that make up an important document.

A document that is peer-reviewed is considered more important. Then citations. Citations have a certain impact, but not as strong a one, because newer things of course have fewer citations. So it's an important factor, but not the strongest. Then journal usage: a sense of the importance of the journal, how much the journal is used. This is aggregated and used here to give journals that are used more a larger boost. And of course the date: the newer the document is, the more boost it gets. The date boost in fact decays. As you go back in time the boost lessens, but the document still gets a boost.

So these are the main factors again: peer review, citations, journal usage and date. These come together into a certain amount of boost that is given to a document, and based on these parameters that document becomes more important. Of course you can have two documents that both have all four factors in the document boost, but if one is newer it will probably be a bit stronger, and if its journal usage is higher it will get stronger even though its date is a bit older. There is a certain amount of leverage between them.

That is the document side. The other side, of course, is the search itself. When we are searching in Primo Central, there are many parameters in the query that we use to help us bring the relevant documents up. First, field boosting: we boost the title, the author, the subject and the description, and they have different strengths. The title is the strongest, after that the subject, then the author, and then the description. So when you search, even if there is a match in the subject, a match in the title will be stronger.

Then there is the proximity of the user's query terms in the document itself. The closer together they are, the stronger the match. The further apart they are, in the title, in the subject, or in the description, which is bigger, the less important the match will be. Those two things are very important for the query.

To help us create these queries and algorithms, we have entity recognition. We have to understand when the user is searching for an author, with an author's initials. Say he searched "JK Rowling harry potter", which is not really representative of Primo Central, but it's a nice example that I like. We have to understand that J.K. Rowling is an author and search it in the author field, with the understanding that if you have J.K. Rowling in your query you want records whose author is J.K. Rowling and not only records with J.K. Rowling in the title. You also want those, but we understand it is an author and will rank the author matches higher. We also recognize citations, and of course dates if you put them in.

The last thing, of course, is expansions, just like we talked about before: the synonyms, the spelling, the stemming, the pluralization, the inflections. All of these are part of relevance in Primo Central and in Primo, to help us bring more results. The user's terms are the most important, that's what I want to stress, but the expansions help us get more results even if they are not exactly what the user typed.

All these things come together, and they all have their strengths. The query parameters and the document boost together give us the key parameters of what we do to return things from Primo Central.

So to summarize everything we spoke about: what the user wrote is the most important thing, because we know that most users write what they want to look for. Everything else around it is there to help the user find things: the "did you mean", which we didn't talk about, the expansions, the spelling, the synonyms. They are all here to help him find what he wants. But we hope that the user knows what he is searching for, and that's why we give his terms more credit than the other things that we do. That's it. I hope it was understood; we covered a lot of things here.
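As a rough sketch of how the two sides described above might combine, here is a small Python example. Everything numeric in it is an assumption: the talk gives only the ordering of the field strengths (title, then subject, then author, then description) and the four document factors, not their weights, and the way the document boost is multiplied into the query match is a choice made purely for illustration. The record layout and function names are hypothetical as well.

```python
import math

# Hypothetical weights; only their ordering comes from the talk.
FIELD_WEIGHT = {"title": 4.0, "subject": 3.0, "author": 2.0, "description": 1.0}


def document_boost(doc, current_year):
    """Combine peer review, citations, journal usage and recency."""
    boost = 1.0 if doc.get("peer_reviewed") else 0.0
    boost += 0.3 * math.log1p(doc.get("citations", 0))
    boost += 0.3 * math.log1p(doc.get("journal_usage", 0))
    age = max(0, current_year - doc["year"])
    boost += 0.5 * math.exp(-age / 10.0)  # decays with age, never reaches zero
    return boost


def field_score(doc, terms):
    """Score each field holding all terms, favouring terms close together."""
    score = 0.0
    for field, weight in FIELD_WEIGHT.items():
        words = doc.get(field, "").lower().split()
        positions = [words.index(t) for t in terms if t in words]
        if len(positions) < len(terms):
            continue  # this field does not contain every term
        span = max(positions) - min(positions) + 1
        score += weight * len(terms) / span  # factor is 1.0 when adjacent
    return score


def relevance(doc, query, current_year):
    """Query match scaled by the document boost; 0.0 when nothing matches."""
    terms = list(dict.fromkeys(query.lower().split()))  # drop repeated words
    if not terms:
        return 0.0
    match = field_score(doc, terms)
    return match * (1.0 + document_boost(doc, current_year)) if match else 0.0
```

In this sketch a peer-reviewed "global warming" article outranks an otherwise identical one that is not peer-reviewed, and a document that does not match the query gets no score regardless of its document boost, which mirrors the point that the user's own terms come first.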
Good luck to you all, and good day. Bye bye.