Episode 294
Recording date: 16 December 2024
Publication date: 20 December 2024
Listen:
External link: https://wikipediapodden.se/automoderator-294/
Show notes
The host is Jan Ainali.
Special episode
This is a special episode about Extension:AutoModerator. Here we meet Sam Walton from the Wikimedia Foundation, who talks about the new tool.
Mentioned in the episode:
- The Wikimania talk (AutoModerator starts at 13:52)
- The dashboard
- How to deploy the tool on your wiki
- Community Configuration
- All episodes in English (podcast feed)
Transcript
And it uses a machine learning model to find edits that are bad, that are ideally vandalism, and then automatically reverts those. So the intention is that it should save patrollers and administrators time.
Hello, and welcome to Wikipediapodden. My name is Jan Ainali, and today we're going to talk about an extension that's not yet available on all the wikis. It's the extension AutoModerator. And with me today to talk about that, I have Sam Walton, who is the Senior Product Manager for Moderator Tools at the Wikimedia Foundation. And even before he joined the Wikimedia Foundation, he was a Wikipedia editor, and he has been an administrator on English Wikipedia, I think, since 2014. So he knows what moderation means in our ecosystem. Welcome, Sam.
Thank you. Thanks for having me.
So I saw your talk at Wikimania 2023, when you were still soliciting new names for the tool. I guess you didn't get any more suggestions, because we're still calling it AutoModerator.
Yeah, it ended up being the best bet, I think.
But for people who have never heard about this and missed that talk, I'm going to link to it in the show notes, because I thought it was a good introduction. Can you give us a brief overview? What is this extension? What is it meant to solve?
Yeah, so AutoModerator is basically automated anti-vandalism software. So the idea is that communities request to get it turned on, and then they can configure exactly how it works. And it uses a machine learning model to find edits that are bad, that are ideally vandalism, and then automatically reverts those. So the intention is that it should save patrollers and administrators time, not having to revert as many edits, because AutoModerator is handling some of the load. And as I say, yeah, completely configurable by administrators as to exactly how it works.
Since you gave that talk in 2023, has anything changed in the trajectory of how you're thinking about the tool?
Yeah, that's a great question, because I don't remember exactly what I said in that talk. But I think broadly, we built kind of what we were aiming to build. And one of the reasons I think that's probably true is really that a lot of the design work had already been done for us by the community, because we have these, I don't know, nine or ten bots I was able to find that have been built by volunteers in the past. You know, many of them have been running for a long time at this point. And so we were really able to look at those and figure out: what do volunteers appreciate or not appreciate about these tools? How have they been built? How are they working? And kind of use that as a blueprint. And so I think even from when I gave that talk, we had a pretty good sense of what this was going to look like.
You mentioned machine learning there. So this is, I'm guessing now, because I don't know, but I'm guessing that it's something more advanced than just AbuseFilter. It's not just matching on simple strings, but there's some more sophisticated kind of pattern recognition behind it. How have you been giving input to this model that is going to be working with the tool?
Yeah, so I can't go into all the details here because the model was built by our research team and like my team wasn't directly involved in it. We super appreciate all the work they do. But I can give kind of my surface level understanding of how it works.
That will be good enough for our listeners as well. No one is an expert in AI, even though we hear about it all day.
Of course. So the model that we're using is called the Language-Agnostic Revert Risk model. So if you've used the ORES filters on recent changes on Wikipedia, this is kind of like an upgraded or newer version of the models that those filters use. So the idea is that the model is trained on the reverts that Wikipedians have done on Wikipedia over, I think, at least about a year. So looking at all the kinds of reverts that have happened on Wikipedia, and then specifically it looks at kind of metadata about those edits. So it's not actually looking at the text of the edits, which is how it can be language agnostic and work on any language Wikipedia. It just looks at the metadata of those edits. So how long has the user been editing? How many edits have they made? What did they do to the article? Did they remove images? Did they remove citations? Did they remove paragraphs? Like, what is the quality of the article before and after the edit? And so by kind of smooshing all that data together in some machine learning way that I don't understand, and comparing that over all of these examples of edits that were or weren't reverted, it can then generate scores for future edits to determine: how likely is it that this edit should be reverted? How much does this look like past edits that were reverted? And then we can just set a cutoff there, and AutoModerator reverts anything above that.
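To make the score-and-cutoff idea concrete: the revert risk model can be queried per revision, and the decision boils down to comparing that score against a configured threshold. A minimal sketch in Python, assuming the publicly hosted Lift Wing endpoint for the language-agnostic model; the exact response shape and the 0.99 cutoff here are illustrative, not AutoModerator's actual internals.

```python
import requests

# Lift Wing endpoint for the language-agnostic revert risk model
# (path and response shape assumed from the public docs; they may differ).
API = ("https://api.wikimedia.org/service/lw/inference/v1/"
       "models/revertrisk-language-agnostic:predict")

def revert_risk(rev_id: int, lang: str) -> float:
    """Return the model's probability that this revision should be reverted."""
    resp = requests.post(API, json={"rev_id": rev_id, "lang": lang}, timeout=10)
    resp.raise_for_status()
    # Expected shape: {"output": {"probabilities": {"true": 0.93, "false": 0.07}}}
    return resp.json()["output"]["probabilities"]["true"]

# Illustrative cutoff only; AutoModerator's real thresholds are configured per wiki.
THRESHOLD = 0.99

def should_revert(rev_id: int, lang: str) -> bool:
    return revert_risk(rev_id, lang) >= THRESHOLD
```

Everything above the cutoff gets an automated revert; everything below is left for human patrollers, which is exactly the knob the community configuration exposes.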
All right. I mentioned in the beginning that this tool is not enabled on all wikis yet, but that also implies it is enabled on some wikis. So who is using this, and what feedback are you getting from them?
Yeah, so it's deployed on, I think, six wikis now. We just had a new one turn on this past week. So it's on Indonesian Wikipedia, Turkish, Ukrainian, Vietnamese, Afrikaans. And I think Bengali is switched on. We had the request from them pretty recently, but I think that's now switched on. So six Wikipedias, because the revert risk model only supports Wikipedia at the moment, unfortunately, so we can't support other sister projects. But yeah, it's on those six wikis. And I think the earliest that we switched one of those on, the first wikis that we were testing on, was about five or six months ago. And the feedback so far has been pretty positive. It can be hard to get feedback about these kinds of tools, because they're the kind of things that if they're working well, people are happy. It's just happening in the background, right? So we are trying to actively survey these communities to hear what they think. But so far, what we've heard has been pretty positive. If anything, the only, I don't know, criticism I've heard is like, it should revert more. Like, we want it to be, you know, taking even more of a role in reverting. So I think that's about the best criticism we could probably hear.
And you built some kind of dashboard as well, so you can get an overview of what the tool is doing, both on each of the projects it's enabled on, but also as a sort of aggregate. What are you hoping to find in that, or how are you tracking it for your usage?
Yeah, the dashboard, I'm really pleased with how it turned out, because it was something we knew right from the very beginning: we wanted AutoModerator to not be this kind of black box or thing that communities sort of turned on and forgot about. You know, we wanted you to be able to always go in and see: what is it doing? How is it behaving? You know, what kind of actions is it taking? So, yeah, the dashboard kind of serves two purposes. On the one hand, it's useful for us. We can see those top-level metrics, how many edits have been reverted, who's being reverted, where is it enabled, that sort of thing. And then on the other hand, we hope that it's useful for those communities to be able to see, okay, on my wiki specifically, here is how it's behaving. One of the trickiest numbers in that dashboard, one that I still don't think is perfect, but it's an interesting one to think about, is trying to define the false positive rate. Because, of course, AutoModerator isn't perfect; it will sometimes revert a good edit that it thinks just looks like a bad edit. But trying to nail that down, to define in data what a false positive is, is really hard. Because you've got to figure out, well, sometimes AutoModerator will be reverted by the vandal, right? And that could be a false positive, or it might not be a false positive. Sometimes it's going to be reverted by an experienced editor, and that probably indicates it's a false positive. But there might be something else going on there. They might have edited the article in a way that looks like a revert, you know. So it was a pretty hard number for us to figure out. But I think we've got something that is broadly accurate.
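One way to picture the false-positive heuristic Sam is describing: classify each undone AutoModerator revert by who restored the edit and how experienced they are. The data shape, field names, and the 500-edit cutoff below are hypothetical illustrations, not the dashboard's actual definition.

```python
from dataclasses import dataclass

@dataclass
class UndoneRevert:
    """An AutoModerator revert that was later undone (hypothetical shape)."""
    undone_by: str                   # who restored the reverted edit
    undoer_edit_count: int           # their edit count at the time
    undoer_is_original_author: bool  # did the reverted editor restore it themselves?

def looks_like_false_positive(undo: UndoneRevert, experience_cutoff: int = 500) -> bool:
    """Rough heuristic: an experienced third party restoring the edit suggests
    AutoModerator got it wrong; the original author re-reverting proves little,
    since that is also what a determined vandal would do."""
    if undo.undoer_is_original_author:
        return False
    return undo.undoer_edit_count >= experience_cutoff
```

Even this simple version shows why the number is fuzzy: every branch encodes a judgment call about intent.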
There was one number there that got me a little bit confused, because there's some sort of time to revert, on average. And when I was imagining this tool, I thought the revert would be rather instantaneous, like immediately after the edit. This time suggests it's waiting a little bit. What's happening there?
Yeah, that's an interesting one, actually. It was a question that we had early on. If I can go on a minor tangent here.
Yeah, sure.
When we were first designing, actually, I think during that 2023 presentation I might have said it, in one of the early descriptions we had of AutoModerator, we said, prevent or revert. Because we weren't sure if we were going to have AutoModerator do an undo, like a contributor, or have it kind of block the edit, like AbuseFilter. And ultimately, we decided on revert for transparency reasons, because it just slots in: you see it in the page history, you see it in recent changes, you can track the contributions very easily. With preventing, like with AbuseFilter, you have to have this whole separate log, this different place to go and see what it is doing and what it has stopped, that kind of thing. And so we opted for revert. And so what that means is, although AutoModerator kind of triggers as soon as an edit happens, it has to put that job into a big queue. And that queue is shared with every other edit that's happening on every wiki that it's deployed on, right? And so it's working through that queue and checking the scores.
So it's the famous job queue.
Exactly. It is literally the job queue. And so that is really just a measure of how quickly the job queue is moving. But yeah, I mean, in almost all cases, edits get reverted within a minute. So it's pretty fast.
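The revert-not-prevent design can be pictured as a two-step flow: the edit saves normally, and a deferred job later scores it and undoes it if needed. The real extension is PHP built on MediaWiki's job queue; this Python sketch with placeholder helpers only illustrates the ordering and why there is a small delay.

```python
import queue

# One shared queue standing in for MediaWiki's job queue.
revert_jobs: "queue.Queue[tuple[int, str]]" = queue.Queue()

def should_revert(rev_id: int, lang: str) -> bool:
    """Placeholder for the threshold check sketched earlier."""
    return False

def perform_undo(rev_id: int) -> None:
    """Placeholder: in reality this is an ordinary 'undo' edit by the AutoModerator account."""
    print(f"Reverting revision {rev_id}")

def on_edit_saved(rev_id: int, lang: str) -> None:
    # The edit is already live and visible here; nothing is blocked at save time.
    revert_jobs.put((rev_id, lang))

def run_jobs() -> None:
    # The queue is shared with every other deferred job on every deployed wiki,
    # which is where the roughly-a-minute revert latency comes from.
    while not revert_jobs.empty():
        rev_id, lang = revert_jobs.get()
        if should_revert(rev_id, lang):
            perform_undo(rev_id)
```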
Yeah. So if someone listening to this podcast now thinks, oh, hey, we should have this on our wiki, how should they go about getting it enabled?
Yeah. So we've got the process lined up on MediaWiki.org. So if you go to Extension:AutoModerator, you'll find the kind of overview of how the extension works. And at the top there, there's a link to a subpage, which is /Deploying. And that has the steps for getting it set up. So the first step is community consensus. Although you individually might be very excited about AutoModerator, your community might not be. So we'd love to know that there is some interest in having AutoModerator before then. You don't need to have all the details figured out, just an indication that you're interested in general. Once you've got that consensus, you can file a Phabricator ticket. We have a Phabricator form that has the steps laid out in there, mirroring this page, so you can kind of check them off as you go. And then there's just a series of setup steps. They're all relatively minor. A lot of them are about localization. You need to pick a username. You can have a different one; it doesn't have to be called AutoModerator, although quite a few of the projects have called it AutoModerator.
So it's going to show up as a user in the recent changes log, making the revert.
Yeah, exactly. Technically, it's a system account. Yeah. But that means it just shows up like a user account.
Like the MassMessage thing?
Yeah, exactly. Exactly. Yeah. So you can pick the username. It also needs to not be currently used by anyone, but it gets reserved by the software once you've picked it. And then you create kind of a user page for it, because it is ultimately an account. We request that you create a false positive reporting page. So again, I mentioned false positives will happen. And we try to prominently link to that page so that if I'm a new editor and it happens to me, it's very clear how I can say, hey, something went wrong. Maybe I don't know how to undo AutoModerator. I don't have to reinstate my edit; at least I can report it to someone and say, I think there was a problem here. And then, yeah, basically once those steps are set up, it's just on us and our team to do the software configuration. We'll deploy it. And then from there, the rest of the configuration is in the community's hands. So us turning the software on doesn't start it running on the project. It just becomes available in a disabled state. And then any administrator can go to a special page, a community configuration page.
Which is also very new.
Yeah, it is. Yeah, I can talk about that as well, because I think it's a very cool piece of software.
I think we can talk about it a little bit, because I imagine not everybody who's listening to this has had the chance to encounter it, because it's used for, like, the mentorship module and new user things. But not everybody is looking at those kinds of nooks and corners of the wiki.
Yeah, yeah, so I can briefly explain then. So Community Configuration came about because, I think, historically, for many configuration changes that communities want to make to software on the projects, the process for doing that is you go and file a Phabricator ticket and you find your friendly neighborhood developer who's happy to go and make that config change for you. And then eventually it rolls out, you know, and all these steps have to happen first. But there are a bunch of features on the wiki that really don't need developer oversight to change the settings for, right? So the Growth team at the Foundation, one of the other product teams, built this software called Community Configuration to use initially for their software for the newcomer homepage. But they've now made it available to other tools to use, basically. So now any new tool that's made could have a Community Configuration config page. And the cool thing about it is, instead of having a sort of opaque configuration that is hard to track, whenever you edit the configuration form, it makes an edit to a MediaWiki namespace page, a JSON file. So there's a structured version of the config, and then you have a nice UI that enables you to click checkboxes and use dropdowns and stuff, but then makes an edit to that page. So anyone can watch that page; they can see what the configuration for AutoModerator is, how it's changing, who's changing it. But then the form, the UI, is only editable by administrators, but it's editable by any administrator. So anyone can kind of go in there and tweak it and discuss what changes they want.
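Because the configuration lives on an ordinary wiki page, anyone can read it (or watch it) with the standard MediaWiki action API. A small sketch; the page title MediaWiki:AutoModeratorConfig.json is an assumption based on how current deployments appear to name it, so check the actual page on your wiki.

```python
import json
import requests

def fetch_automoderator_config(wiki: str = "id.wikipedia.org",
                               page: str = "MediaWiki:AutoModeratorConfig.json") -> dict:
    """Read the community-configured settings straight from the on-wiki JSON page."""
    resp = requests.get(
        f"https://{wiki}/w/api.php",
        params={
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",
            "rvslots": "main",
            "titles": page,
            "format": "json",
            "formatversion": "2",
        },
        timeout=10,
    )
    resp.raise_for_status()
    page_data = resp.json()["query"]["pages"][0]
    content = page_data["revisions"][0]["slots"]["main"]["content"]
    return json.loads(content)  # the structured config that the admin-facing form writes
```

The same page's history answers who changed what and when, which is the transparency point Sam is making.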
And with this sort of safety net, even if it gets enabled, it's not turned on, so to speak. Like, if the extension is enabled, the functionality is not turned on. You seem to me a little bit overly cautious; like, why do we need the community consensus? Are there any risks or extra burdens that can come onto a community that enables AutoModerator, or is it just to be extra kind to the community and not force anything on them?
There's definitely an element of that kindness to it. You know, because this is ultimately kind of automated actions that are happening on Wikimedia pages, we wanted to be very cautious that this was something that the community controls, not the Foundation. You know, the Foundation isn't making these edits, right? The community is turning a thing on that makes the edits. So we wanted to make sure that distinction was clear, and we just wanted to be really overcautious about ensuring that the community felt like this was a thing that is in their control, that they can turn on or off whenever they like, that they have control over the configuration for. And really, we as the product team are just serving the role of technically making that available. So there's that feeling, but we also want to ensure that there is enough community awareness and buy-in to be monitoring what it's doing. Because again, we know it's not perfect. I'm not going to tell you that it's going to be right 100% of the time. And so we want to ensure that there are enough community members who are aware of it and are thinking about it, that they are actually following what's happening and paying attention to it.
So on the wikis where it has been turned on already, how much in the way of wins or benefits or time saved has there been? Does it need to be a wiki with a lot of vandalism going on for it to be worth turning on? What's your thinking about that?
Yeah, there's a few aspects to that. I mean, one thing I'll say is I think it's very qualitative. It's really very much the vibes of how patrollers and administrators feel about it. So I think it will vary from wiki to wiki. And I think also because it's not just about how many edits get reverted, it's also about how quickly they get reverted. So there's a lot of factors there. It's kind of hard to say for sure that it's this much of a time saving or anything like that. But I don't think it's only useful on the big wikis. I certainly think that's where it shines, because of the numbers involved. But if you think about it in aggregate over even smaller wikis, that's a lot of reverts that are happening and a lot of bad contributions being reverted. So I still think that's worthwhile. And the other interesting thing, I think, around who is seeing bad content, is the time to revert. Because you have to think that every minute that some bad content is in an article, that could be who knows how many people seeing that content and going, huh, Wikipedia sucks, it's kind of going away. One interesting thing that we did as part of this project is we added some extra data tracking to our page views data so that we could track what revision of an article someone saw. You know, on a page view, what version of the page were they actually seeing? And so that allowed us to calculate in aggregate how many page views were to content that was vandalism, that was bad content. I don't have the numbers to hand, but it's pretty interesting to see how that changed from one wiki to the next. And so we could quite directly calculate, okay, if AutoModerator reverts within a minute, say, how many page views is that? And it was a pretty significant number of page views that it is preventing just by being operational, even at the highest threshold.
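The exposure argument is simple arithmetic: views of vandalism are roughly the article's view rate times how long the bad revision stays live, so cutting revert latency cuts exposure proportionally. The numbers below are invented purely to show the shape of the calculation.

```python
def vandalism_views(views_per_minute: float, minutes_live: float) -> float:
    """Rough expected number of page views served while a bad revision is live."""
    return views_per_minute * minutes_live

# Invented figures: an article getting 20 views per minute,
# reverted after 1 minute by AutoModerator versus 4 hours by a human overnight.
print(vandalism_views(20, 1))       # ~20 views see the vandalism
print(vandalism_views(20, 4 * 60))  # ~4800 views see it
```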
I think we have a famous example from around the Nobel laureates' announcements in 2009, I think, where there was vandalism in an article. It only lasted for less than a minute; it was reverted within the same minute. But a newspaper made a screenshot during that time. They missed the vandalism on the page and published a screenshot literally saying that the literature prize winner had written a book called Poop. So it's fairly innocuous, but still, it sort of makes for a funny story.
Yeah, absolutely.
What I've been thinking about as a possible advantage of this tool is that it's always on, especially for a community which might be very concentrated in one time zone. Like, for example, Swedish Wikipedia, where most editors are in Sweden. So if someone makes a vandalism edit there in the middle of the night, it's going to stay until maybe four or five in the morning, until an administrator wakes up, the early ones. Is that also a possible win, or am I hoping too much?
No, that's definitely part of it. One of the things I learned about before we started on this project: I was interviewing a patroller on, I forget, I think it might have been Tamil Wikipedia. It was an Indian-language Wikipedia. And they were saying that they used the recent changes special page to patrol for edits, and that that generally worked for them. You know, it was few enough edits on their wiki usually to be able to use that. But they found that whenever a big event happened, or maybe it was the weekend, there would just be way too many edits, because things are happening, you know, there's news coming out, there are articles to update, and it sort of stopped being useful for them. And so I think, yeah, that stands out in my mind as an obvious advantage. OK, if you suddenly can't actually see everything anymore that's going on, at least, you know, there's something ticking away in the background, catching some of the stuff that you're missing.
And I don't think you mentioned it now, but I think I read it somewhere: in this community configuration, you can tweak how strict you want AutoModerator to be, like, you know, take only the absolutely worst, or take a little bit more, but then there might be a few more false positives. Is that still something you have in that configuration?
Yeah, that's sort of inherently a trade-off you have to make with machine learning models. I think the technical terms are precision and recall, but I always use the words accuracy and coverage, you know: how accurate is it going to be, and how many edits is it actually going to find? And that's always going to be a trade-off: the more accurate you get, you know, the higher the percentage success rate is going to be, but you will actually catch a lower percentage of how much vandalism there is in total. So that was something that we did a bunch of testing on beforehand. We had this kind of open process where we set up a spreadsheet with a bunch of data and what AutoModerator would do at different thresholds, and we collected a bunch of data and set a few thresholds that we thought spanned the range of where we felt comfortable. You know, at the top end, you're going to revert relatively few times, but there's pretty high accuracy on those reverts. Whereas at the lower end, yeah, you're going to get more false positives, but you are going to catch more of the vandalism. And we intentionally bounded that so that it couldn't go super low, right, and just kind of be reverting whatever.
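For readers who want the accuracy/coverage trade-off in concrete terms, here is a tiny worked example of precision (Sam's "accuracy") and recall (his "coverage") at different cutoffs. The scores and labels are invented; only the pattern matters: raising the threshold raises precision and lowers coverage.

```python
# Each pair is (model score, was the edit actually vandalism?): invented data.
scored_edits = [
    (0.99, True), (0.97, True), (0.96, False), (0.92, True),
    (0.88, False), (0.85, True), (0.60, False), (0.40, False),
]

def precision_and_coverage(threshold: float) -> tuple[float, float]:
    flagged = [(score, bad) for score, bad in scored_edits if score >= threshold]
    true_positives = sum(1 for _, bad in flagged if bad)
    total_vandalism = sum(1 for _, bad in scored_edits if bad)
    precision = true_positives / len(flagged) if flagged else 1.0
    coverage = true_positives / total_vandalism  # i.e. recall
    return precision, coverage

for t in (0.98, 0.90, 0.80):
    p, c = precision_and_coverage(t)
    print(f"threshold {t:.2f}: precision {p:.0%}, coverage {c:.0%}")
```

At 0.98 only the most obvious vandalism is caught but almost nothing good is reverted; at 0.80 more vandalism is caught at the cost of more false positives, which is the range AutoModerator's bounded thresholds are meant to span.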
So the communities who have this on, if they feel that it's either not reverting enough, or they're getting too many false positives, they can go in and tweak this, and they don't have to make a Phabricator ticket, and it will just apply from then on.
Exactly. Exactly. That's the beauty of Community Configuration.
So what's next in your plans for the tool? What's in the roadmap?
Yeah, so there's two big things, I would say, that we're still sort of working away on in the background. The first is integrating a different, and we think better, model than the one we're using right now. And the other one is support for stewards to turn AutoModerator on on small wikis. So on the new model: the one we're using right now is the language-agnostic revert risk model, but there is also a multilingual revert risk model. So this is one that actually does look at the text of an edit. So that one uses a large language model to analyze the text of the edit, and that is then also a signal as part of that package of data that's being looked at and trained on. So that one only supports, I think it's 47 languages, because of the limits of that model. But the work that our research team has done has shown that it is more accurate than the one we're using right now. The complication is that it varies by language. And so whereas with the language-agnostic model we could just set one set of thresholds, with this one we're still trying to figure out how we're going to handle that. It might be that we need to give each community sort of full control over what number they pick, basically. So we're still playing around with some ideas there about how to progress with that one. And then the second thing is, yeah, as I mentioned, steward control for small wikis. So, you know, we have 300 language Wikipedias, but actually most of them have either no administrators or, you know, like one or two, right? But they still get vandalism. And so we'd like to be able to give stewards the ability to centrally set AutoModerator up on these projects and control how it works, and turn it on or off on individual projects if they want to. So that that one administrator on the small wiki doesn't have to worry about it.
And perhaps as a final question, then: is there something that I missed asking you about the tool that you were sort of dying to tell me about? Like, oh, I hope I get this question.
Do you know, I don't think so. I think maybe the only feature of AutoModerator we didn't talk about is the talk page message. So this is a configuration option. You don't have to have this turned on. You can turn it off if you want to. But I think we would encourage using it. Basically, whenever AutoModerator reverts someone, it can send them a talk page message. And it says, hi, I'm AutoModerator. I'm a bot. You know, I've reverted one of your edits. And we very much assume that this is being sent to someone who is the recipient of a false positive, right? If it's a vandal, who cares what we write? But if it's a good-faith user, then we wanted this to explain: here's what happened. Like, it's okay. It's not a reflection on you. You can report it here, et cetera, et cetera. And so I think that's a nice little feature, just in terms of communication and ensuring, especially when we build these moderator tools, that we're not just centering the moderators. We're also thinking about the people who are the recipients of those moderator actions.
That got me thinking. I've been doing some patrolling as well and seeing some behavior from vandals. Have you seen vandals replying to this and saying, yes, this is a false positive, claiming it was a good edit? Like, have you, or are they just not that tenacious?
Do you know, I haven't looked, but I'm really curious now. I'm curious to go and investigate some of these messages and see if we've had any responses, because I honestly haven't been looking. So I suspect that that probably happens. You know, if I think about my experiences on English Wikipedia, I definitely see that, with vandals responding saying, no, I didn't, or, you know, whatever. So I'm sure it must be happening, but I've not looked.
All right. Thank you very much, Sam, for telling us about the AutoModerator tool.
Yeah, thank you for having me. It's been great.
So this has been Wikipediapodden, a special episode about the extension AutoModerator. You can find all our episodes in English (we're doing weekly episodes in Swedish as well) in the show notes, and they're under tag/english on wikipediapodden.se. And happy holidays, everyone. Thank you.