Ranking Research Institutions Based On Relevant Academic Conferences
Hello everyone, I'm Laylita Bobbi, and I work at [inaudible]. This is a side project that I worked on with a friend of mine, Yasin, who wasn't able to make it here today, so I'm presenting on behalf of us both. To give a brief overview of the contest: we all know, after this workshop, the importance of influential nodes and how they can be applied in different applications. Here we focused on academia, and on the field of data mining specifically, and we used the publications from recent years of data mining conferences to predict and find the most effective research institutions in the field.

As you may know, one challenge was that the results of the 2016 conferences were unknown at the time of the competition, so we had no ground truth to evaluate our methods and results against. There were also big-data challenges to overcome, which we handled by using Hadoop and Spark, and we used the Microsoft Academic Graph as our database for the conference papers. The team's approach was to compute the scores of all the affiliations, using the scoring scheme provided by the KDD Cup, for each year from 2011 until 2015, and then apply a rank-aggregation method across those yearly rankings to produce a final, unified ranking of the institutions' 2016 submissions and acceptances. As I mentioned, the problem was that we didn't know what the 2016 ranking would be, so we couldn't tell which rank-aggregation technique would get us closest to it. What we did instead was to use the 2015 results as a test set: we evaluated our rank-aggregation techniques on the years 2011 to 2014, checked which technique got us closest to the 2015 ranking, and then applied that same technique across all the years to produce our ranking for 2016.

The rank-aggregation methods we looked at included Borda count, an intuitive election method in which the voters assign scores to the candidates based on their preferences; the score a candidate receives from each voter is the number of candidates placed after it in that voter's ranking.
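To make that concrete, here is a minimal sketch of Borda-count aggregation in Python. The institution names and yearly lists are made up for illustration, and the per-voter points are combined with a plain sum here; the mean, median, and geometric-mean variants mentioned next operate on the same per-list scores.

```python
# A minimal sketch of Borda-count rank aggregation (toy data, not the
# actual KDD Cup affiliation lists).

from collections import defaultdict

def borda_aggregate(rank_lists):
    """Aggregate several ranked lists with Borda count.

    From every list ("voter"), each candidate receives a score equal to
    the number of candidates ranked below it; the per-list scores are
    then summed to produce the final ranking.
    """
    points = defaultdict(float)
    for ranking in rank_lists:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            points[candidate] += n - 1 - position  # candidates placed after it
    return sorted(points, key=points.get, reverse=True)

# Toy yearly rankings of institutions (made-up names).
yearly_rankings = [
    ["inst_A", "inst_B", "inst_C"],   # e.g. 2013
    ["inst_B", "inst_A", "inst_C"],   # e.g. 2014
    ["inst_A", "inst_C", "inst_B"],   # e.g. 2015
]
print(borda_aggregate(yearly_rankings))  # ['inst_A', 'inst_B', 'inst_C']
```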
You then combine those per-voter scores with an aggregation function, which can be the median, the geometric mean, or the mean, across all of the voters and all of the rankings, to come up with the final, unified ranking.

Another method that we looked at was Fagin's algorithm. Given a number of ranked lists, you do sorted access on each of the lists in parallel, and for each item retrieved you do random access into all the other lists to fetch that item's scores; summing the scores then provides a final ranking as well. The sorted access stops once the top-k items have been seen in all of the lists, and at that point you have your final ranking. A small sketch of this procedure is included after these remarks.

Our final, proposed algorithm, which is the one we ended up using, is simpler: you compute the normalized scores for each yearly ranking and sum them for each institution across the years, and that sum gives you the final ranking; a sketch of it, together with the evaluation metric, is also included at the end.

I have some results here: these are the NDCG@20 values for the different conferences in the different phases. As you can see, the proposed algorithm, the sum of the normalized scores, has better scores compared to Borda count and Fagin's algorithm, which were the alternatives. It's the same for Phase 2, where the values are generally higher for the proposed algorithm, and for Phase 3 as well, so that's the technique we decided to go with. In general, our approach was to take the affiliations, compute the affiliation scores, and then aggregate the rankings across the recent years to produce a final ranking for 2016. Thank you very much.
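As promised above, here is a minimal sketch of the Fagin-style top-k procedure, assuming each yearly ranking comes as a dictionary of scores and that the aggregation function is a plain sum; the data layout is an illustrative assumption, not our actual Hadoop/Spark implementation.

```python
# A minimal sketch of Fagin's algorithm for top-k aggregation over several
# score-sorted lists (toy data; the aggregation function is a sum).

def fagin_topk(score_lists, k):
    """score_lists: list of dicts {item: score}; each list is used both in
    sorted order (sorted access) and by key (random access)."""
    # Pre-sort each list by descending score to simulate sorted access.
    sorted_lists = [sorted(d.items(), key=lambda kv: -kv[1]) for d in score_lists]
    seen = [set() for _ in score_lists]   # items seen under sorted access, per list
    all_seen = set()
    depth = 0
    # Sorted access in parallel until k items have been seen in *every* list.
    while True:
        for i, lst in enumerate(sorted_lists):
            if depth < len(lst):
                item = lst[depth][0]
                seen[i].add(item)
                all_seen.add(item)
        depth += 1
        common = set.intersection(*seen)
        if len(common) >= k or depth >= max(len(l) for l in sorted_lists):
            break
    # Random access: fetch missing scores for every item seen, then sum.
    totals = {item: sum(d.get(item, 0.0) for d in score_lists) for item in all_seen}
    return sorted(totals, key=totals.get, reverse=True)[:k]

# Toy usage with two made-up yearly score lists.
year_a = {"inst_A": 0.9, "inst_B": 0.7, "inst_C": 0.2}
year_b = {"inst_B": 0.8, "inst_A": 0.5, "inst_C": 0.4}
print(fagin_topk([year_a, year_b], k=2))  # ['inst_B', 'inst_A']
```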
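And here is a minimal sketch of the proposed aggregation, the sum of per-year normalized scores, together with the NDCG@k measure we used to compare the techniques against the 2015 ranking. The per-year max-normalization and the dictionary layout are illustrative assumptions rather than the exact KDD Cup scoring pipeline.

```python
# A minimal sketch of the proposed aggregation (sum of per-year normalized
# scores) and of an NDCG@k comparison against a reference ranking.

import math

def aggregate_normalized(yearly_scores):
    """yearly_scores: list of dicts {institution: score}, one per year."""
    totals = {}
    for year in yearly_scores:
        top = max(year.values()) or 1.0   # normalize each year's scores to [0, 1]
        for inst, score in year.items():
            totals[inst] = totals.get(inst, 0.0) + score / top
    return sorted(totals, key=totals.get, reverse=True)

def ndcg_at_k(predicted, reference_scores, k=20):
    """NDCG@k of a predicted ordering against reference relevance scores."""
    def dcg(relevances):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    gains = [reference_scores.get(inst, 0.0) for inst in predicted]
    ideal = sorted(reference_scores.values(), reverse=True)
    return dcg(gains) / dcg(ideal) if dcg(ideal) > 0 else 0.0
```

In our setting, `yearly_scores` would hold the per-year affiliation scores computed with the KDD Cup scheme, and `reference_scores` the held-out 2015 scores used as the test ranking.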