Twitter has changed its API rate limit policy again. The change has good and bad parts, but it effectively hamstrings some analytically important endpoints, which is unfortunate.
What is an API?
An API — or Application Programming Interface — is a protocol that programs use to interact with each other. There are different kinds of APIs, but the Twitter REST API (like most APIs) boils down to a standardized format in which developers can pose questions to an online platform and interpret the corresponding answers.
For example, the Twitter API tells developers how to ask “Who follows @sigpwned?” and to expect a list of the IDs of the users who follow @sigpwned in return.
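To make the "question and answer" idea concrete, here is a minimal sketch of how a program might phrase that question and read the answer. The endpoint URL follows the style of Twitter's REST API v1.1 and is an assumption here, as are the function names; the sample answer is made up for illustration, and authentication and the actual network call are left out entirely.

```python
import json
from urllib.parse import urlencode

# Assumption: endpoint path in the style of Twitter's REST API v1.1.
FOLLOWERS_ENDPOINT = "https://api.twitter.com/1.1/followers/ids.json"


def build_followers_question(screen_name):
    """Phrase "Who follows @screen_name?" as a request URL."""
    return FOLLOWERS_ENDPOINT + "?" + urlencode({"screen_name": screen_name})


def parse_followers_answer(body):
    """Pull the list of follower IDs out of a JSON answer."""
    return json.loads(body)["ids"]


# Illustrative only: a made-up answer, not real follower data.
sample_answer = '{"ids": [101, 202, 303]}'

question_url = build_followers_question("sigpwned")
follower_ids = parse_followers_answer(sample_answer)
```

The point is only that both sides agree on the format ahead of time: the program knows how to spell the question, and it knows the answer will carry a list of IDs.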
What is a rate limit?
Answering the questions programs pose to an API takes some computational power, especially for platforms as big as Twitter. (Once you have more than 100M active users in your system, it takes some time to figure out who follows whom.) For this reason, an API will only answer so many questions from any one program before it stops answering questions from that program for a while. In this way, an API can make sure no one program takes up all its resources. The policy by which questions are answered or ignored based on usage is called an API’s “rate limit.”
How did Twitter change its API Rate Limit Policy?
The old rate limit policy was very simple: a user could pose 350 questions to the API per hour, and any questions after that were ignored until the current hour had passed. The hour-long windows started at the top of the hour, so if you asked 350 questions between 12:00 and 12:30, you had to wait until 1:00 before you could ask any more questions.
As rate limits go, this policy was a good one. Sure, you could only ask 350 questions per hour, but you can get a lot of work done with 350 questions, and planning for different workloads wasn’t too hard since all questions counted against the API rate limit the same way.
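The old policy can be sketched as a single counter that resets at the top of each hour. This is only an illustration of the behavior described above, not Twitter's implementation; the class name and the date used in the example are arbitrary.

```python
from datetime import datetime, timedelta


class HourlyRateLimiter:
    """One shared budget of `limit` questions per clock hour."""

    def __init__(self, limit=350):
        self.limit = limit
        self.window_start = None  # top of the hour for the current window
        self.used = 0

    def allow(self, now):
        """Return True if a question asked at `now` would be answered."""
        top_of_hour = now.replace(minute=0, second=0, microsecond=0)
        if top_of_hour != self.window_start:
            # A new clock hour has begun, so the count starts over.
            self.window_start = top_of_hour
            self.used = 0
        if self.used >= self.limit:
            return False  # over budget: ignored until the next hour
        self.used += 1
        return True


limiter = HourlyRateLimiter()
noon = datetime(2012, 9, 1, 12, 0)  # arbitrary example date

# Ask 350 questions, one every 5 seconds, all before 12:30.
answered = sum(limiter.allow(noon + timedelta(seconds=5 * i)) for i in range(350))

blocked_at_1245 = not limiter.allow(noon + timedelta(minutes=45))
allowed_at_1300 = limiter.allow(noon + timedelta(hours=1))
```

Because every question draws from the same counter, planning a workload only requires knowing how many questions it needs in total.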
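This most recent change affected three key elements of Twitter’s API rate limit policy:
1. Rate limit windows are now 15 minutes instead of 60 minutes. New rate limits start every 15 minutes instead of every 60 minutes. This change doesn’t affect much.
2. Rate limits are counted per question type instead of across all questions. Asking “Who follows @sigpwned?” does not affect whether Twitter will answer you when you ask “What lists is @sigpwned on?” The rates are counted separately. This change wouldn’t affect much either, except for change #3.
3. Rate limits for some question types have been increased 2x; rate limits for other question types have been decreased 6x.
Because rate limits are now counted per question type, users can ask significantly more questions of the API on an hourly basis than they could in the past. While this sounds good at first blush, it’s not all upside, since some endpoints got their allowed usage decreased. If you rely heavily on a question with a decreased rate limit, this definitely isn’t good news.
You can find the full list of question types and rate limits here, but here are the ones analysts will care about:
Increased Rate Limit:
“What public information has @sigpwned provided about himself?”
“What tweets has @sigpwned sent?”
“Who is on the list wcgworld/wcg-people?”
Decreased Rate Limit:
“Who follows @sigpwned?”
“Who does @sigpwned follow?”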
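A rough sketch of the new policy, with 15-minute windows and a separate budget for each question type, might look like the following. The per-window budgets are not Twitter's published numbers; they are derived from the old 350-per-hour limit and the 2x and 6x factors, and the question type names are hypothetical. Aligning windows to clock quarter-hours is also an assumption made for simplicity.

```python
from datetime import datetime, timedelta

WINDOW_MINUTES = 15
OLD_HOURLY_LIMIT = 350
WINDOWS_PER_HOUR = 60 // WINDOW_MINUTES

# Hypothetical budgets per 15-minute window, derived from the 2x / 6x factors.
LIMITS = {
    "followers": round(OLD_HOURLY_LIMIT / 6 / WINDOWS_PER_HOUR),  # decreased 6x
    "profile": round(OLD_HOURLY_LIMIT * 2 / WINDOWS_PER_HOUR),    # increased 2x
}


class PerQuestionRateLimiter:
    """A separate budget per question type, reset every 15 minutes."""

    def __init__(self, limits):
        self.limits = limits
        self.windows = {}  # question type -> (window start, questions used)

    def allow(self, question_type, now):
        # Assumption: windows are aligned to clock quarter-hours.
        start = now.replace(
            minute=now.minute - now.minute % WINDOW_MINUTES,
            second=0,
            microsecond=0,
        )
        window_start, used = self.windows.get(question_type, (start, 0))
        if window_start != start:
            window_start, used = start, 0  # new window, fresh budget
        allowed = used < self.limits[question_type]
        if allowed:
            used += 1
        self.windows[question_type] = (window_start, used)
        return allowed


limiter = PerQuestionRateLimiter(LIMITS)
noon = datetime(2012, 9, 1, 12, 0)  # arbitrary example date

followers_answered = sum(limiter.allow("followers", noon) for _ in range(20))
profile_still_ok = limiter.allow("profile", noon)
followers_next_window = limiter.allow("followers", noon + timedelta(minutes=15))
```

Exhausting the budget for one question type leaves the others untouched, which is why total hourly capacity goes up even as the follower questions get scarcer.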
How does the change affect analytics?
If you make regular use of API questions that just had their rate limit reduced, as W2O does, your life just got harder. For example, doing graph analysis on the accounts a group of users follows will now take about 6x longer, all things being equal.
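The 6x figure is simple arithmetic, sketched below. The 10,000-account workload is a hypothetical number chosen for illustration, and the sketch assumes one question per account (no paging through long lists) and that questions are asked as fast as the limit allows.

```python
OLD_QUESTIONS_PER_HOUR = 350
DECREASE_FACTOR = 6


def hours_to_collect(num_questions, questions_per_hour):
    """Hours needed to ask `num_questions` at a steady hourly rate."""
    return num_questions / questions_per_hour


accounts = 10_000  # hypothetical workload: one "Who does X follow?" per account

old_hours = hours_to_collect(accounts, OLD_QUESTIONS_PER_HOUR)
new_hours = hours_to_collect(accounts, OLD_QUESTIONS_PER_HOUR / DECREASE_FACTOR)

print(f"old policy: {old_hours:.1f} hours")
print(f"new policy: {new_hours:.1f} hours")
```

Under these assumptions, a job that used to finish in a little over a day now takes roughly a week.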
The reality is that if clients need data, they’ll get data. With these rate limit changes, though, if a client deliverable requires data that takes longer to collect, the client deliverable may take longer to finish. Ultimately, managing these new API rate limits and keeping the trains running on time will require companies to get more clever about how they approach API usage.
Cleverness always implies an investment, whether it’s in time, money or both. It’s curious that Twitter wouldn’t simply offer a paid model for its API that would capture some of the investment these companies will now have to make and add it to Twitter’s own bottom line. Instead, this investment will just get “thrown away” into more and more complex processes and software.
While the reasons behind this decision are interesting to speculate about, the question of “Why?” is a different discussion.
EDIT: Podcast!
Because this issue is important, we recorded a short podcast exploring it in more detail. You can find the link above.