MyAnimeList, or MAL for short, provides detailed information about anime and also lets users build a personal list of all the anime they have watched. Various parts of this list are customizable, including the ability to give shows a score from 1 to 10.

MyAnimeList page for Fullmetal Alchemist: Brotherhood

While browsing around MAL I noticed something interesting on some users’ profiles: everyone has the ability to list their geographical location. Moreover, you can search for users by location. This got me wondering if I could collect data from users in a region that I specify.

MyAnimeList user search

Yup! If I enter “USA” I am given a list of users who have that listed as their location. And there are a lot. The user search page displays up to 24 users at a time. As far as I know there is no way to run such a search and show all users at once, so if I want to gather data on everyone in a search I need to go through every page. Luckily for me, the URL of these searches is pretty easy to manipulate.

https://myanimelist.net/users.php?cat=user&q=&loc=USA&agelow=0&agehigh=0&g=&show=24

Here is the URL for page 2 of a user search with the location “USA” specified. Each page contains 24 user listings, and the “show” parameter is the number of listings to skip, so page 2 ends in “&show=24”, which means “show listings 25 through 48.” Page 3 ends in “&show=48”, and so on.

To download all of these pages I used a Chrome extension called Simple Mass Downloader. In the Download List tab I set a pattern URL that looks like this:

https://myanimelist.net/users.php?cat=user&q=&loc=USA&agelow=0&agehigh=0&g=&show=[0:24000:24]

Basically, that extra bit at the end tells the downloader to request every offset from 0 to 24000 in steps of 24. So we download the pages at offsets 0, 24, 48, 72, 96, all the way to 24000. If there aren’t that many pages/users, the downloads past the last page just return a “Failed – No file” message. Doing this with the location USA specified gave me 726 pages (that’s over 17,000 users!) in under a minute.
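For anyone who would rather generate the URLs without the extension, here is a minimal sketch of the same range expansion in C#. The class and method names (SearchUrls, ForOffset, ForPage, Range) are made up for illustration; only the URL format itself comes from the search page above.

```csharp
using System;
using System.Collections.Generic;

public static class SearchUrls
{
    // Listings per search page, as observed on the MAL user search.
    public const int PageSize = 24;

    // Builds the search URL for a given "show" offset (0, 24, 48, ...).
    public static string ForOffset(string location, int offset)
    {
        return "https://myanimelist.net/users.php?cat=user&q=&loc="
            + Uri.EscapeDataString(location)
            + "&agelow=0&agehigh=0&g=&show=" + offset;
    }

    // Builds the search URL for a 1-based page number; page 1 has offset 0.
    public static string ForPage(string location, int page)
    {
        if (page < 1) throw new ArgumentOutOfRangeException(nameof(page));
        return ForOffset(location, (page - 1) * PageSize);
    }

    // Expands a [start:end:step] pattern, treating both ends as inclusive.
    public static IEnumerable<string> Range(string location, int start, int end, int step)
    {
        if (step <= 0) throw new ArgumentOutOfRangeException(nameof(step));
        for (int offset = start; offset <= end; offset += step)
            yield return ForOffset(location, offset);
    }
}
```

Calling SearchUrls.Range("USA", 0, 24000, 24) yields the same list of URLs as the pattern, and SearchUrls.ForPage("USA", 2) gives the page 2 URL shown earlier.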

Now that I have those search result pages, I need to extract the usernames. I wrote a program in C# to read the HTML files and find the usernames. I noticed that each username appears in a profile image link like this:

<div class="picSurround"><a href="/profile/jbax1899"><img class="lazyload" data-src="https://cdn.myanimelist.net/images/userimages/4206707.jpg?t=1602991200" border="0" width="48"></a></div>

So to pull a username I can look for the beginning part of that line and copy everything that follows it, up to the closing double quotation mark. Here’s the code I wrote to do just that:

using System.Collections.Generic;
using System.IO;

string[] filePaths = /*FILE DIRECTORY HERE*/
string searchFor = "\"picSurround\"><a href=\"/profile/";
List<string> usernames = new List<string>();   //every username found across all files
foreach (string filePath in filePaths)
{
	using (StreamReader file = new StreamReader(filePath))   //close the file once it has been read
	{
		string line;
		while ((line = file.ReadLine()) != null)
		{
			int index = line.IndexOf(searchFor);    //find where the username should be
			if (index != -1)
			{
				int start = index + searchFor.Length;   //first character after "/profile/"
				int end = line.IndexOf('"', start);     //closing double quotation mark
				if (end != -1)
					usernames.Add(line.Substring(start, end - start));   //record username
			}
		}
	}
}

This builds a list of usernames out of the HTML search pages I downloaded.
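The same extraction can also be written with a regular expression, which picks up every profile link in a page even when several sit on one line. This is only an alternative sketch, not the code used for this project; the UsernameExtractor class name is made up, and the pattern assumes the markup looks exactly like the picSurround snippet above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UsernameExtractor
{
    // Matches the profile link inside a picSurround div and captures
    // everything between "/profile/" and the closing double quote.
    private static readonly Regex ProfileLink = new Regex(
        "class=\"picSurround\"><a href=\"/profile/([^\"]+)\"",
        RegexOptions.Compiled);

    // Returns every username found in the given HTML, in document order.
    public static List<string> Extract(string html)
    {
        var usernames = new List<string>();
        foreach (Match match in ProfileLink.Matches(html))
            usernames.Add(match.Groups[1].Value);
        return usernames;
    }
}
```

Passing the contents of each downloaded file (for example from File.ReadAllText) to UsernameExtractor.Extract returns the usernames on that page.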

Now I want to pull anime scores from the lists of these users. I found Jikan.net, a C# wrapper for the Jikan API (an unofficial MyAnimeList API), to be a perfect fit for this project. Jikan.net offers functions like “GetUserAnimeList” and “SearchAnime” to quickly and easily access the MAL database.

I created a WinForms application for ease of use. Now I can easily view the usernames collected, specify an anime to search for in users’ lists, watch it run, and view the results. You can download it here to use yourself.

One restriction I had to deal with is the rate limit of MAL’s API. I could get away with one query per second most of the time, but it was safer to limit myself to one query every two seconds. When pulling up anime lists for hundreds of users, this ends up taking quite a while. To save time, I decided to record up to 20 scores per region before stopping and moving on to the next.
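The throttling and the 20-score cutoff can be sketched roughly like this. It is not the code from my application: ScoreCollector and CollectAsync are made-up names, and fetchScore is a hypothetical stand-in for whatever API call looks up one user's score for the chosen anime. The sketch also assumes that a score of 0 means the user has not scored the show.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ScoreCollector
{
    // fetchScore is a hypothetical stand-in for the API call: it returns the
    // user's score for the chosen anime, or null if the anime is not on the list.
    public static async Task<List<int>> CollectAsync(
        IEnumerable<string> usernames,
        Func<string, Task<int?>> fetchScore,
        int maxScores,
        TimeSpan delayBetweenQueries)
    {
        var scores = new List<int>();
        bool firstQuery = true;
        foreach (string username in usernames)
        {
            if (scores.Count >= maxScores) break;   // enough scores for this region

            // Wait between queries (but not before the first) to respect the rate limit.
            if (!firstQuery) await Task.Delay(delayBetweenQueries);
            firstQuery = false;

            int? score = await fetchScore(username);
            if (score.HasValue && score.Value > 0)  // assumption: 0 means "not scored"
                scores.Add(score.Value);
        }
        return scores;
    }
}
```

For the setup described above, the call would use a maxScores of 20 and a delay of TimeSpan.FromSeconds(2), once per region.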

Now that I have the anime scores I can create some interesting visualizations. This map was created with the US heat (choropleth) map from amCharts. An interactive version is available here.
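To feed a map like this, each region needs a single number. A minimal sketch, assuming the map is shaded by the mean score of each region and that the collected scores are kept in a dictionary keyed by region name (RegionStats and AverageByRegion are hypothetical names):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RegionStats
{
    // Returns the mean score for every region that has at least one score.
    // Regions with no recorded scores are left out rather than reported as 0.
    public static Dictionary<string, double> AverageByRegion(
        IReadOnlyDictionary<string, List<int>> scoresByRegion)
    {
        return scoresByRegion
            .Where(pair => pair.Value.Count > 0)
            .ToDictionary(pair => pair.Key, pair => pair.Value.Average());
    }
}
```

Leaving empty regions out of the result keeps them from showing up on the map as if they had a real score of zero.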

There wasn’t much variance in the data: all scores were between 7.5 and 9, and the average was about 8.3. There doesn’t seem to be a particular pattern either. Perhaps the anime I chose for this example was a bad pick, as it is one of the most loved in the entire medium, so the scores would always be decent on average. Even if the visualization isn’t very enlightening, I think it is an interesting proof of concept for what can be done with data gathered from MyAnimeList.
