Asterisk: Do you all want to briefly introduce yourselves?
Keith: Sure. I'm Keith Coleman, VP of product here at X and formerly Twitter. I've been here for about eight years. I used to run the overall consumer product development team and now focus on building Community Notes and other related things.
Jay: I'm Jay Baxter, a senior staff machine learning engineer at X. I was the original lead of the machine learning, voting, and reward model work on Birdwatch and then Community Notes. Previously, I worked on recommender systems as part of Cortex Applied Research and have been at the company for ten years.
Lucas: I’m Lucas Neumann. I am a product designer. I worked on Community Notes at Twitter and then X for almost four years, and now I consult with the team on the project externally.
Emily: I'm Emily Thai. I was the embedded consultant from the University of Chicago Center for Radical Innovation for Social Change on Birdwatch and then the Community Notes team. RISC is a social impact incubator — using behavioral science to tackle social problems in unorthodox ways. We got introduced to the Community Notes team, to Keith, and provided a little bit of academic expertise and a perspective from outside of the tech world.
Asterisk: It's very exciting to have you all here. And I want to start right from the beginning. Where did the idea for Community Notes come from?
Keith: The idea came about around the end of 2019. It started with the observation that people wanted to get accurate information on social media, but it was really hard. There was obviously misleading information going around. The main approaches that companies were using were either internal trust and safety teams deciding what was or was not accurate or allowed, or partnerships with professional media organizations trying to make those decisions. Both approaches had three big challenges. One was speed — information moves really quickly on social networks and on the internet. It was really common for these trust and safety or fact checker decisions to take multiple days to check a claim, which is the equivalent of infinity in internet time.
Then there was a scale issue. It's really hard for these small groups of people to look at and review that stuff. And probably most importantly, even if you could deal with the speed and scale issues, there was still a fundamental trust problem. A lot of people just did not want a tech or media company deciding what was or was not misleading. So even if you could put labels on content, if people think it's biased, they're not likely to be very informed by it. Those problems were kind of obvious around this time. And we were wondering — what could actually solve them? How could you build some solution that could act at internet speed, at internet scale, and actually be trusted and found helpful by people from different points of view from across the political spectrum?
Pretty early on, it was obvious that crowdsourcing was a potential solution space. Wikipedia had obviously reached a massive scale. I think it's larger than any encyclopedia out there. It was fast. It would be updated within minutes, typically, when news stories changed. It had some challenges on the trust side and bias side. But we thought, you know, if we can overcome those, maybe that would work — that was kind of the origin of the concept. We prototyped a few different ideas for what that might look like. And one of those ideas showed a mockup, a prototype, depicting people on X — then, it was Twitter — submitting notes that could show on a post. The idea was that if the notes were reasonable, people who saw the post would just read the notes and could come to their own conclusion.
Asterisk: One interesting finding, both from your team and external researchers, is that people trust these notes much more than they trust true/false flags or misinformation flags. I'm curious if that was something that you suspected from the start, or where that UX design decision came from.
Keith: A recent study shows that, indeed, people do trust notes written specifically about the post they're on, with details about the topic, more than the classic misinformation flags — which is awesome. And, yes, it was one of our early design guesses. One of the working assumptions was that, if you could add context to the statements made in a post or a tweet, people would be better informed than if it was just some kind of generic statement. All the initial prototypes depicted very specific notes that were dealing specifically with the post in question. We showed these prototypes to hundreds of people across the political spectrum, and it consistently came up that they appreciated the specificity with which the notes dealt with the content of the post, and they appreciated that they had sources — which they all did.
Asterisk: The Birdwatch pilot is in January 2021, right? So this is a lengthy prototyping phase.
Keith: Yes. It started with two different prototypes depicting this kind of idea. We first tested it with a range of content and with a set of people from across the political spectrum. We were kind of blown away by our initial results. There were two different designs in the first test, but one of them tested really well. It did so well that we wondered if that was anomalous — we said, let's test this again, but with even more controversial topics. So we tested it again with posts covering Covid, Nancy Pelosi, Trump, and all the things that tend to raise a lot of political emotions. And again, it tested well. People from across the political spectrum would say: “Hey, yeah, you know, I generally like this person who's tweeting, but I appreciate this note letting me know that maybe this really isn't accurate.”
Asterisk: How early was this?
Keith: At this point, it was purely concept mocks built in Figma. We were trying to create the hardest conditions under which something like this needed to work.
Asterisk: And you found that what people liked were these very specific, targeted fact checks.
Keith: Yes. And importantly, that they were from the community. When we were testing, someone got one of the links to the prototypes and then sent it to an NBC reporter, so there's actually an NBC story with a bunch of them, and you can see some of the differences and similarities to how it works today. This was probably early 2020 when this was happening.
Asterisk: Obviously, this is right around when misinformation or conflicting stories about COVID became a huge topic. Did that influence your design process?
Keith: It was a good example of a polarizing topic for which we wanted something to be found helpful, even by people who normally disagree. I would say it was yet another good test case on which the product had to prove itself.
Asterisk: So the other very famous element of Community Notes — at least in certain circles — is the bridging algorithm, which is the algorithm that the product uses to pick notes that are helpful and not politically polarized. I think Jay can probably speak to it best, but I'd love to know where in the design process that first came up, and the process behind it.
Jay: From the very beginning, we had this idea that we wanted the notes to be found helpful across the political spectrum. But there are a lot of considerations. We're balancing manipulation resistance, and when you have a totally open source data set and an open source algorithm like we did, you can't just naively add up the votes and see who has the most or something. So we considered a variety of classes of algorithms that have some manipulation resistance, like PageRank.
Actually, we spent a lot of time working with PageRank variants. We landed on the bridging algorithm after basically implementing a bunch and evaluating them on a lot of attributes. Obviously it's tough to evaluate, but the bridging algorithm performed the best in these tests, and I think it's just very nice that you get this natural manipulation resistance, as well as only surfacing notes that are found helpful across the political spectrum.
Asterisk: Can you explain how it works?
Jay: The main rating action is that we ask people whether they found a note helpful or not. Then, we look at people's rating histories on previous notes. What the algorithm does is find notes where people who've disagreed on their ratings in the past actually agree that a particular note is helpful. That's not explicitly defined based on any political axis — it's purely based on people's voting histories. And this mechanism results in very accurate notes because, when you do have political polarization among people who've disagreed substantially, they really tend to only agree that notes are helpful when the notes are also very accurate.
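A minimal sketch of how a bridging-based scorer like this can work, assuming the matrix factorization form described in the team's published paper: each rater and each note gets an intercept plus a latent factor, the intercepts are regularized more heavily than the factors, and a note is only treated as helpful if its intercept clears a threshold. The hyperparameters, threshold, and plain SGD training loop below are illustrative simplifications, not the production implementation.

```python
# Minimal sketch of a bridging-based scorer (illustrative, not production code).
import numpy as np

def score_notes(ratings, n_users, n_notes, dim=1,
                lam_i=0.15, lam_f=0.03, lr=0.2, epochs=200):
    """ratings: iterable of (user_id, note_id, value), where value 1.0 means
    the user rated the note Helpful. Each rating is modeled as
    mu + user_intercept + note_intercept + user_factor . note_factor.
    Because intercepts are regularized more heavily than factors, a note's
    intercept only rises when raters on opposite ends of the latent factor
    dimension both rate it helpful."""
    rng = np.random.default_rng(0)
    mu = 0.0
    user_i, note_i = np.zeros(n_users), np.zeros(n_notes)
    user_f = rng.normal(0, 0.1, (n_users, dim))
    note_f = rng.normal(0, 0.1, (n_notes, dim))

    for _ in range(epochs):
        for u, n, r in ratings:
            err = r - (mu + user_i[u] + note_i[n] + user_f[u] @ note_f[n])
            uf, nf = user_f[u].copy(), note_f[n].copy()
            # SGD on squared error with L2 regularization, heavier on intercepts.
            mu        += lr * (err - lam_i * mu)
            user_i[u] += lr * (err - lam_i * user_i[u])
            note_i[n] += lr * (err - lam_i * note_i[n])
            user_f[u] += lr * (err * nf - lam_f * uf)
            note_f[n] += lr * (err * uf - lam_f * nf)

    # In the published approach, a note is shown as Helpful only when its
    # intercept clears a fixed threshold (roughly 0.4), not by raw vote counts.
    return {note: float(note_i[note]) for note in range(n_notes)}
```

Because disagreement that falls along the latent dimension is absorbed by the factor term rather than the intercept, a one-sided pile-on of ratings does little to raise a note's score, which is where the natural manipulation resistance Jay mentions comes from.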
Asterisk: There's this fascinating graph that you have in the paper describing this algorithm. I think it has helpfulness on the y axis and polarization on the x axis. And you get this diamond shape, which shows that most notes are very polarized one way or the other and only middlingly helpful, and that very few super polarized notes are also very helpful. There's this clear band of non-polarized helpful notes that falls out organically at the top. Is this just something that naturally fell out of people's rating behavior?
Jay: Yeah, I think even if you considered slightly different types of bridging mechanisms, I think you'd find something similar to this because there are a lot of contributors with varying quality and diligence. I've heard a critique that less than 100% of proposed Community Notes are made visible. Well, it's probably a good thing, right? Not every single proposed note is accurate and helpfully written. I do think the algorithm is imposing this particular diamond structure — if we did have a slightly different algorithm, you might see more of a curved diamond or a star shape or something if we were to regularize the model differently. But definitely you would see that the majority of notes do not have this bridging-based agreement.
Asterisk: It definitely matches my subjective impression. When I went in and looked at some of the notes that were flagged as very polarized, they tend to be less specific — like, say: “the 2020 election was decided fairly.” And then the helpful fact checks are more like: “this specific statistic about Covid is incorrect,” or “this specific event didn't happen,” or “the photo was taken from a different thing that happened three years earlier.”
Jay: One thing that people really like about Community Notes is that the quality bar is quite high. I think it just wouldn't be as popular of a product if we always showed a note, or showed something without first putting it through an algorithm.
Asterisk: The flip side of this is, do you find that the algorithm ever struggles to do fact checks on issues that are inherently very polarized?
Jay: Obviously, if raters who disagreed in the past can't find a note that they agree on, then no note shows. You could argue that maybe there should be a note in those cases, but maybe these aren't cases where it's possible to change people's minds with that note. Maybe a better note could be written that would change more people's minds, but if the existing note is not finding bridging-based agreement, this means there's a limit to its usefulness.
That said, I find that we still see a reasonable number of notes on even the most polarizing things. Often these notes are on quite objective things, like: “this is a video of a bombing from two years ago, not the current conflict.” Even people who really disagree on most things can often agree on notes like that.
Keith: People often ask us the question that you just asked. But if you look at the notes, most of them are on polarizing political topics. The vast majority of what people see in Community Notes deals with super controversial topics in a way that people do find fair. It deals with elections. It deals with immigration and abortion. We've talked about this a lot throughout the development of the product — there could be many goals for a product like this, but the goal of notes is to actually be informative to people. If there's a note which is correct, but it's not going to inform people, is there a point in putting it up? There might actually be a cost to putting it up if people are going to feel that it's unfair or biased, and it may actually reduce trust in the overall system and thus reduce overall impact. So our focus is really adding notes where we think they will genuinely improve understanding for people from different points of view.
Jay: I'll also just add that the bridging algorithm almost works better in a polarized setting. If there's some topic that everyone agrees on, the quality bar of the note is still going to be pretty high, but people will agree that it's helpful even if it's not quite as well written or the source isn't quite as good. The more polarizing the topic is, the higher the quality of the notes can end up being.
Asterisk: So moving forward in our story — you launch the pilot, which is at this point called Birdwatch, in 2021. What was the process of getting that set up like? What did you learn from it?
Keith: We started with a tiny number of users. Before that we had initially tested it with some Mechanical Turk-type contributors, just to get a quick gut check on what people might actually write in these notes, but we didn't know how it would work in the real world. We initially launched with a really small participant base — 500 people on the first day, and we quickly expanded to 1000, but we ran at around 1000 to maybe 10,000 contributors for quite a long time. We learned a lot through that process. Just to give you a sense of how rudimentary the product was at that point, there was no bridging algorithm — it was just a supermajority rules algorithm, where a note needed 84% helpful to be considered valid. We also didn't show notes on posts.
And to see the notes, you had to go to a separate Birdwatch site, so you had to be really committed to participating in this pilot, because, again, we had no idea what was going to be in them. Was it going to be a dumpster fire, or was it going to be gold? When we were contemplating the design of the note page — the page that shows all the notes on a post — we actually talked about putting a dumpster fire GIF at the top just to prepare people for what might be below.
It turned out the quality was much higher than that. It's not always perfect, but it was much better than a dumpster fire. Still, it was a really basic first launch, and the product evolved a lot through what was a year plus in that pilot phase.
Lucas: One data point that helps illustrate how small scale we were at that time — I remember that we were at maybe 500, 1000 people, and most other experiments at Twitter back then would start at 1% of users. So we really started very, very, very tiny to learn and see, “What's the size of the risk we're taking? What changes do we need to make?” And from there we grew very, very slowly.
Asterisk: 1% of Twitter users would have been a couple million people?
Lucas: Yeah. If you think about any other features that are launching on platforms like this on any given day, they're usually at that scale — 5%, 1%, 0.5%.
Asterisk: So why the decision to start that small?
Lucas: The level of uncertainty was just very high. If you're going to launch a new video player, you can start at 1%. There's very little risk there. But if you're talking about a new concept that people have never seen before on the internet — we spent a lot of time trying to understand the best way to even explain what it was. Literally, what are the right words to put on the screen so that somebody reads this and understands what we're doing?
Asterisk: How big was your team at that point?
Lucas: Under ten.
Asterisk: I'm curious about what your feedback loops were like during production at this stage. What metrics were you looking at? What else were you paying attention to? How often were you making tweaks?
Lucas: We had multiple sources of feedback. There was the usage data, the notes, and the ratings themselves. We also did qualitative research — we watched people use the product, and they'd tell us what they thought.
Keith: Sometime early in the pilot, we created a group of users that we could interact with on a regular basis to get feedback — just daily observations or comments on new features we were contemplating launching.
Emily: And I think there were a lot of benefits to starting at that scale. I mean, I don't know what the comparison is — I've never launched a product at Twitter at 1% of the user base — but I think it enabled that very tight feedback loop. Our team was reading not every note, but every tweet about notes, and a good chunk of the notes themselves. We really, really knew what was in that database. And we could point to real examples when we thought something was a risk. Or when a risk we were worried about turned out to not be a big deal, we could deprioritize it.
The last source of feedback that I would mention would be the academic work that we were doing. One of you three can probably speak better to the impact on the product design, but there was a lot of work put into making sure every decision was made with intention. My team at UChicago and I were helping to facilitate the advisory board of different academics who worked on misinformation, worked on online communities, and had all this expertise to bring to bear. They could say, for example, if what you're interested in is building a community like Wikipedia, then what you need to do is start small and build up norms and things like that, based on this whole body of research in Human Computer Interaction. So we had feedback from the users, we had feedback from people who study these things, and then we had feedback from doing tons and tons of research. I think the iteration on that side was really fast because of that.
Jay: The iteration speed of our feedback loop was way faster than it was for other teams at the company that had to serve every user. Keith had our team set up as what we called a “thermal project.” It was this special mode where we could do crazy things and build hacky prototypes and ship quickly. I mean, we had a lot of flexibility to ship unpolished stuff and iterate fast because we had such a small set of users who had opted in to being part of a pilot. That accelerated us a ton.
Asterisk: What were the biggest ways the product and your thinking about it changed while the pilot was running?
Jay: One key thing was the algorithm development. We didn't have any data at the start, so we didn't know what type of algorithms might work. We collected data from the users in the pilot phase and then used that to iterate on algorithms — we could simulate some data, like adversarial attacks, but mostly we were just using the real data from contributors. By the time we actually launched to 100%, we had already gotten a long period of data, and we'd found a good bridging algorithm that worked. There was also the rating form. I know Emily and Lucas and I iterated a lot on the rating form.
Lucas: The options that people can select when rating a note were something we spent a lot of time on, both to try to figure out what data we needed to capture to make the algorithm work, but also what options we could present on a screen to have people think critically about the note that they're rating, and to help guide them to the ultimate goal of the product, which is to find an accurate, helpful note. Emily helped a lot with that part.
But there were some very drastic changes. For example, we started with Community Notes being non-anonymous, so people's names were attached to their notes. This was the first design, and it was based on the intuition that in order to build trust, you have to see who was behind a note, or that perhaps we could build upon someone's credentials as an expert in some area. But very early on in this prototyping phase, we learned from our contributors that they were not comfortable with the chance of having their name attached to a note on a tweet from the President, for example, or someone who has a large following, and that they would rather do this work anonymously. That was a very strong signal.
There was also a signal from academic research that, in anonymous systems, people may be more likely to share opinions without the pressure from their peers. Doing that switch from a non-anonymous to a fully anonymous product was a very large project, a very large investment, but we got enough signal in the early phases that we had to do it.
Keith: The other thing that emerged was that it became clear that notes shouldn't really stand on the author's reputation. The notes should stand on their own. You should be able to read the note, and it should give you the information and cite the sources needed for you to get what you want from it. It was much more powerful to do that than to try to rest it on one individual's identity. It was a surprise to us. In hindsight, it seems kind of obvious that it's better, but it was not our initial instinct.
Asterisk: And that is what falls out of those later studies that compare Community Notes to expert fact checks as well — the trust is higher.
Lucas: Yes. But one thing to note about that outcome is that we had to put a lot of work into overcoming people's priors. If you go back to 2021, and someone sees a tweet with a box on it, they immediately think, “Oh, this is a fact check.” They would assume that Twitter wrote it, or that the Twitter CEO decided that it should be there. What we're taking one hour to tell you here is something we had to explain to them in a split second with just one line of copy. Arriving at that design and what those words are — I don't think anyone here has ever done so many iterations on one rectangle. Things like, what's the shade of blue that will make people calmer when they see this? The original design that Keith made was an orange box with “This is misleading information” at the top. Coming from that design to what we have now was a learning process.
Keith: That line — “Readers added context they thought people might want to know” — we iterated on that line so many times to find something that could succinctly describe what had happened here, how this came to be, that this was by the people, not by the company, and that it was there for your information, not to tell you what to think.
Emily: I don't think you will ever hear any of us — anybody who worked on this project — ever say the word “fact check.” There's a care to avoid using that phrasing in any of the things we say about the product, any of the language about it, anything on the product surface, because it's entirely about providing context and information and then letting you make your own decision about how to trust it. That's what leads to that higher trust. But, as Lucas said, we're working through a lot of people's priors on what the box on a tweet means. And everybody else still calls it a “fact check.”
Asterisk: This might be a natural segue into talking about the broader rollout — this is, I think, October 2022 for America and then December 2022 globally. What changed when your user base expanded so dramatically?
Keith: We had tested the heck out of the product before that. One of the things we haven't talked about, but that we observed in the pilot, was that contribution quality was very mixed. We had developed this system through which people earn the ability to write and also can lose the ability to write if they write junk that other people don't find helpful. We had built the bridging algorithm, and the bridging algorithm had been live in production, with about 20% of the US population as viewers for a number of months. And we had run a significant number of tests on note quality. We were evaluating whether notes were found helpful across the political spectrum in survey experiments and other tests. We were evaluating note accuracy. We were evaluating to what degree notes impact the sharing of posts. So the system had been tested at quite a significant scale already, and we felt pretty confident that it was going to roll out and note quality would be reasonable. And also, if for some reason there was a problem, we could always turn it off or dial it back.
Broadly, when we launched, it worked. The note quality was pretty high. The earned capability system, the reputation system, and the bridging algorithm led to notes that were genuinely found helpful across the political spectrum. And I think you could see that in the dialogue after the launch. I remember very early after launch there was a note on a White House tweet, and they retracted the tweet and updated the statement. What an incredible power to have put into the people's hands — that regular people on the internet can call something out, and it can change the way an important topic is discussed. It was pretty remarkable.
Asterisk: That's one reason I really wanted to do this interview — people seem consistently very impressed by the quality of Community Notes, and I wanted to know what went into making that happen. But I also want to talk about some of the challenges of scale. This is sort of conceptually complicated, and I'm interested in how you think about this, but — how big is the product now, and how big would you like it to be? What percentage of tweets get noted? In an ideal world, how many do you think should be getting noted? What's the gap?
Keith: We sometimes phrase this as: “What's the total addressable market of notable tweets?” It's really difficult to know, and if we knew, it would be really helpful. The way we would want to define it is: How many tweets or posts are there where there exists a note that people who disagree would find helpful? But then there's also the question of visibility — it's much more impactful to have notes on higher visibility content than on content that's not seen by anyone. But it's hard to know if there exists a note that would be helpful to people who disagree. Our assumption is that the answer is there are more tweets like this than we have notes on today, but we don't know what the limit is. And so, generally, we just try to expand the program to cover more content. But we're constantly measuring whether we're still upholding the high quality bar that these notes are indeed helpful.
Asterisk: Can you talk about some issues you've faced as you try to expand?
Jay: It definitely seems like different people have different preferences on how many notes they want to see. Some people want notes on every single tweet, even if they're accurate, because it's just cool to read more context. And some people think that even notes on misleading stuff shouldn't be needed because people should just know.
Keith: Particularly with satire or jokes — that can be an area where people disagree. Is it obviously funny? Does it need a note or not? That's why we like the approach we take, because it leaves it up to users. And we'll do what seems like humanity's preference instead of us making a decision about that.
Asterisk: Another thing I wanted to talk about is speed. I was reading a preprint by Yuwei Chuai's group at the University of Luxembourg. The paper is about the overall impact of Community Notes on misinformation — basically, they found that when a tweet is noted, this does reduce engagement, but this still has a pretty minimal effect on the overall spread of misleading tweets because a note needs to appear really, really fast to have an impact. The statistic that made my eyes pop was that the half-life of a tweet is something like 79 minutes. Half the impressions it's ever going to have, it has in the first 79 minutes. Now, I know you've done a lot of work on increasing the speed at which notes appear, from around five days, or something, early in the pilot to within a day or so now. What are the challenges in making notes happen faster?
Jay: Great question. First off, I just want to talk about the half-life of a tweet. I think in this paper they looked at the firehose of all tweets and then took the half-life from that. But, you know, the median tweet doesn't get a lot of engagement. If you're talking about the median tweet that goes viral above some certain threshold, then the half-life is many times longer.
Asterisk: And I believe that viral tweets are more likely to get notes?
Jay: By far. Because in order to have enough people see a tweet so that a note gets written, and then for that note to get enough ratings to show it, they're —
Keith: Typically being seen by a lot of people for up to 24 hours. Not 79 minutes. It's a much longer time window.
Jay: And then, even if you did see a post before it got noted, if you engaged with the post, we'll send you a notification afterwards with the note, once the note's been rated helpful. As far as speed, we're doing a lot. I think that the speed has been improving pretty rapidly. We've done a lot to optimize the data pipelines behind the scenes, but also things just get faster as we get more contributors.
Keith: When we first started, going back to the pilot phase, the focus was entirely quality. The thinking was, we'll deal with speed and scale as we grow. As you mentioned, in the pilot, it would take multiple days for a tweet to get a note. But no one was seeing these things back then. There are a couple places that add time to the process. One is the organic time for someone to decide a tweet or post might benefit from a note and then for people to rate it. And then there's the time to actually score the note. We're working on speeding up both of those. The scoring — that is, the frequency at which we can score — used to be three to five hours, and it's soon going to be in the minutes.
This means notes can now go live within minutes of being written and rated. And then, you have to compare that to the alternatives. It's extremely common to see professional fact checks take multiple days. We see this all the time. In the first few days of the Israel-Hamas conflict, there was so much misinformation. There were people posting video game footage, claiming it was happening in Israel. There were pictures from other countries from prior conflicts, saying this was happening in Gaza, and notes were appearing within a small number of hours. I think the median time was about five hours. That was before all these speedups we've done. And then, some of those same corrections were only published as fact checks two to four days later. And so, already at that point, notes were vastly outperforming the status quo.
Jay: On top of that, we also do media matching. That is, we give our top writers the ability to write a note that's actually about the media on a post instead of the post itself. And then, when a note like that is rated helpful, it will show on all the matched copies of that media across the platform, which can also happen within minutes of those posts getting created. Statistics that are based on the public dataset on a per-note basis are often not counting media matches, which really, really speeds up the median time to note.
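As an illustration only, since the matching pipeline isn't described here: one way a helpful media note could be propagated to posts carrying the same image is with a perceptual hash. The imagehash library and the distance threshold in this sketch are assumptions made for the example, not a description of the production system.

```python
# Hypothetical sketch of media matching: reuse a helpful media note on other
# posts whose image is a near-duplicate. Library choice and threshold are assumed.
from PIL import Image
import imagehash

HAMMING_THRESHOLD = 4  # assumed tolerance for near-duplicate images

def build_media_index(noted_media):
    """noted_media: list of (note_id, image_path) for notes rated helpful that
    were written about the media itself rather than a single post."""
    return [(note_id, imagehash.phash(Image.open(path)))
            for note_id, path in noted_media]

def note_for_new_post(image_path, media_index):
    """Return the note_id of a matching media note for a new post, or None."""
    candidate = imagehash.phash(Image.open(image_path))
    for note_id, noted_hash in media_index:
        # imagehash overloads '-' to give the Hamming distance between hashes.
        if candidate - noted_hash <= HAMMING_THRESHOLD:
            return note_id
    return None
```

Run at post-creation time, a check like this is what would let a matched copy pick up the note within minutes, as described above.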
Keith: It's hard to know because it's hard to run the test in the real world, but I suspect that there are large numbers of misleading memes or ideas that would have gone viral but haven’t because of notes and media matching in particular. It's common for us to see these cases where a post goes live with an outdated or fake video. It's got a note within a small number of hours. It's a media note. It's instantly matching on every other post that uses that media. I would guess that prior to the existence of Community Notes, there would have been a lot of copies of that image being shared around.
Asterisk: Obviously, content moderation at scale remains a huge problem across social media. What are the biggest lessons that you've learned about developing new content moderation methods? What would you most like to say to teams at other companies, or people working on this problem broadly?
Keith: One of the biggest challenges in the moderation space is delivering outcomes that people feel are fair. One thing I think Community Notes does well is that it tends to deliver notes that people find to be fair and helpful. And I think they find the process to be relatively fair, too, because everyone has a voice. It's based on open source data and code that's in the public, so people can audit it and critique it. Other areas of moderation face this same fairness challenge. I would love to see new approaches that try to make those decisions in open, fair ways that people trust. I suspect they would be really successful and that it could ultimately lead to outcomes on these really controversial decisions that people can get behind — even if they disagree with them.
Jay: I think there's a lot of little micro design decisions that we made along the way that are pretty helpful for many types of moderation systems — like anonymity, actually using the crowd, getting moderation input from users rather than just mods, and having an algorithm find points of agreement — and then the design choices like adding friction. Much of the time people are not super diligent if they just have to click one button angrily. But if people are going to put in a lot of effort to write something, they're probably going to be more careful. Maybe Lucas and Emily could talk more about that.
Lucas: I think there were a lot of design challenges that we had to overcome because we stuck to some principles that we set out at the very beginning. There's the fact that the data is open source, that the code is open source, and that we still don't have any buttons on the X side to promote or demote any note. We've never changed the status of an individual note. We either take the entire system down, or it's running. Those three non-negotiables created a lot of work for us. The anonymity part, the reputation part, all of the small details on the screen — each of these makes it so that people can, at the end of the day, see a note and know, “I can trust this.” But that full circle was years and years of work.
Emily: As somebody who does not work on social media content moderation, I think there's actually a lot you can learn by looking at Community Notes for design choices, principles and values, and decisions you have to make when you are in a project of truth-seeking — no matter what that is, right? I work at a charity evaluator, and you would think that has nothing to do with social media. But it informs a lot of the way that I think about truth-seeking in general.
Asterisk: Did you ever get any pushback to the idea that you can’t change the status of a specific note?
Keith: We’ve always had support for the approach that whether a note shows is up to the people, not a company. And that the process is auditable and verifiable — you can download the code and the data and reproduce the same results you see on X. Our principle is that we are building a system to produce notes that are helpful and informative, through a transparent, open, public process. And if there’s a problem, it’s not a problem with a note, it’s a problem with the system. So we’d rather take down the whole system — and improve it — than take down a note.
The only case where the company would take action on a specific note would be if it were to violate platform rules, but the bridging algorithm and the process through which people have to earn the ability to write notes inhibit this in the first place. It might seem surprising that a company is OK not having an override button, and perhaps it could feel uncomfortable to not have one, but I think it aligns with a noble sense of how people want the world to work. It just feels fair and clean and principled.
Asterisk: You also mentioned that one of your principles was open sourcing code and data. What impact did that have?
Jay: This is a huge — but valuable — constraint we put on ourselves. When people think about the obvious ways one might try to identify what’s helpful to people who normally disagree, they often think about using tweets or likes or engagements — or something like that. But because we wanted the algorithm to run entirely on public data, we ruled out using any of that and relied solely on the contribution data from Community Notes itself, which is public. This led to a quite novel and robust bridging-based matrix factorization algorithm, which is naturally manipulation resistant — even when open-sourced. Rather than looking at tweets or engagements or things like that, it looks at how often people agree in their Community Notes ratings.
A key benefit of that approach is that people have skin in the game when they make those ratings, as those ratings actually elevate notes. So there’s a fundamental incentive to rate in a way that’s aligned with your actual point of view. Additionally, the open source data has proven helpful. For example, in addition to increasing transparency and trust, it has enabled independent, external research on Community Notes, like two recent studies that showed notes reduce resharing of posts by 50-61% and increase deletion of posts by about 80%. And a study in the Journal of the American Medical Association found notes they reviewed to be highly accurate.
It’s even allowed us to accept code changes from the public. The code that scores notes actually includes code written by people outside of the company and submitted on GitHub. We’d love, ultimately, for the algorithm itself to be written by the people — just like the notes themselves.