Wikipedia:Edit filter noticeboard

Welcome to the edit filter noticeboard
Filter 1009 — Flags: disabled
Last changed at 21:01, 8 April 2022 (UTC)

Filter 102 — Actions: disallow

Last changed at 02:19, 8 April 2022 (UTC)

Filter 958 (deleted) — Flags: disabled

Last changed at 02:03, 8 April 2022 (UTC)

Filter 874 — Actions: disallow

Last changed at 23:17, 6 April 2022 (UTC)

Filter 1196 — Actions: disallow

Last changed at 23:19, 6 April 2022 (UTC)

This is the edit filter noticeboard, for coordination and discussion of edit filter use and management.

If you wish to request an edit filter, please post at Wikipedia:Edit filter/Requested. If you would like to report a false positive, please post at Wikipedia:Edit filter/False positives.

Private filters should not be discussed in detail here; please email an edit filter manager if you have specific concerns or questions about the content of hidden filters.




We need to talk about filter 874

  • 874 (hist · log) ("LTA username / impersonation creations", private)

I was looking through the log here, and frankly, it looks like the majority of hits are false positives. Now it's not possible to be sure; when you stop an account from being created, you don't get to find out what they were going to do. A few examples:

  • "Fabonz": I'm guessing this is supposed to be an impersonation of Favonian? But I doubt it.
  • "Bl223907": is supposed to be Bbb23, maybe?
  • "Varajeezuz": zzuuzz, maybe?
  • "Enas Ahmed Eisa" Medeis, it would seem.

And so on. Should we:

  • Go through this giant ball'o'cruft and carefully fix each of these FPs?
  • Drop to tag-only, and rely on people reviewing Special:Log/newusers?
  • Throw some WP:TNT in the whole thing and start over?
  • Something else?

I've never understood the value of disallowing LTA usernames. They are going to pick another name and disrupt anyway. Except, unless you're a checkuser, you don't know the name they picked. If they stay with "[admin] is a wanker", their edits get reverted on sight. If their second attempt is "BoringUser37823" maybe their edits stick. Suffusion of Yellow (talk) 23:20, 22 March 2022 (UTC)

Yes it is a problem, and has been for a long time. Unfortunately false positives are not usually very visible with this filter. I'd be inclined to work back through the false positives, as well as make some other common sense pruning. I do see some value in having a filter which prevents names which are abusive, and helps with denying recognition to some memes. Sometimes the username is a key part of the disruption. -- zzuuzz (talk) 00:44, 23 March 2022 (UTC)[reply]
Thank goodness for https://regex101.com/debugger... so far I've managed to combine some patterns together, and I think we should probably split the filter out into more specific issues ~TNT (talk • she/her) 12:32, 25 March 2022 (UTC)[reply]
Moving a couple of things into Special:AbuseFilter/1196 ~TNT (talk • she/her) 13:21, 25 March 2022 (UTC)[reply]
Thank you! I'll keep an eye on both. Suffusion of Yellow (talk) 19:31, 25 March 2022 (UTC)[reply]
Honestly I'm not entirely sure what's being attempted here. It looks like we're just forking the problem and I predict the same issues will persist, as seen in the first log entry of the new filter. It really needs a big old prune (along with more than a few more word boundaries). I'll take a scalpel to both filters in the near future. -- zzuuzz (talk) 19:45, 25 March 2022 (UTC)[reply]
I've yet to find a time slot to dig through this, but I just wanted to add that I think we're also going to have to talk about 102 (hist · log). -- zzuuzz (talk) 04:51, 31 March 2022 (UTC)
  • Someone just pointed out Special:AbuseLog/32282853 to me, which arises from an issue in 874's 79th alternative. To allow discussion without leaking a private filter, I'll anonymize it as having the following format: abc.d?[e3] . Thus the issue arises from the confluence of the wildcard, the question-mark quantifier, and the lack of a boundary assertion at the end. To me, that's far too much flexibility to be having in a filter that leaves no obvious avenue for appeal. Since we've been talking about this for a bit, I'd like to make a proposal: All patterns in 874, 102, or any other account-creation filter:
    1. Must not, when excluding any characters that are quantified at minimum length 0, consist only of 3 or fewer literal ASCII characters (like the pattern in 102 that could be reduced to ryr)
    2. If they specify only 4 to 5 literal ASCII characters (like the 79th alternative in 874), must not contain any wildcards mid-pattern. At a minimum, they must use \S or \w, but preferably something narrower than those. Depending on what the characters are, this may be advisable even at higher character counts (as in the Medeis example).
    3. Must not contain any mid-pattern quantifiers of wildcards or large character classes with large or infinite maximum lengths (e.g. zz[a-z]*uu[a-z]*zz), unless both ends are very very narrowly tailored (e.g. Suffusion.*Yellow).
    4. If any string they match could plausibly occur in any context other than abuse, must start with a ^ or \b and end with a $ or \b. (So a direct match on a username that isn't a word outside of Wikipedia (e.g. zzuuzz) doesn't need such an assertion, nor does one with some basic substitution or repeating-character quantifiers fit in (e.g. z[sz]+[uv]+z[sz]+), but something that could arise in a normal context (e.g. zuz, although that also breaks rule 1) needs those boundary assertions.)
  • Exceptions could be made in emergencies or by concurrence of two EFMs, to be noted in the filter comments, with the understanding that they will monitor for FPs. Thoughts? -- Tamzin[cetacean needed] (she/they) 21:15, 1 April 2022 (UTC)
  • @Tamzin: Sound like good ideas, but those giant crufty regexes just give me a migraine. So, I boldly switched 874 to /x ("ignore whitespace") mode. We haven't done that with a filter before AFAIK, but maybe we should start. If no one reverts that change I'll start pruning. Suffusion of Yellow (talk) 00:16, 2 April 2022 (UTC)
  • @Zzuuzz, TheresNoTime, and Tamzin: Just did a major tidy of 874 and 1196. In the end, I downloaded the entire filter log (about 22000 hits) so I could grep locally instead of waiting for abusefiltercheckmatch. There were some patterns in there that had caused thousands of false positives. Oh and (?x) works wonders. Suffusion of Yellow (talk) 22:48, 4 April 2022 (UTC)
    Nice bit of tidying, thanks. -- zzuuzz (talk) 10:09, 5 April 2022 (UTC)[reply]
    And did a similar de-cruft of 102. "Only" looked at the last five years of the log. If I removed someone's "favorite" string, please restore, but consider providing some context in the notes <grumble grumble>... Suffusion of Yellow (talk) 02:00, 8 April 2022 (UTC)[reply]
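To make the regex concerns in this thread concrete, here is a minimal Python sketch. It uses only the anonymized stand-in abc.d?[e3] quoted above, not the real private pattern. The sample usernames and the tightened variant are hypothetical, and Python's re module is only an approximation of the regex engine AbuseFilter runs on. It shows how the wildcard, the optional character, and the missing boundary assertions combine to hit unrelated names, and how (?x) mode lets a pattern be laid out one piece per line with comments.

```python
import re

# Anonymized stand-in from the discussion, NOT the real private pattern.
LOOSE = re.compile(r"abc.d?[e3]", re.IGNORECASE)

# Hypothetical tightened variant (an assumption for illustration only):
# the wildcard is narrowed to a small class and both ends get a
# word-boundary assertion. Written in verbose (?x) mode.
TIGHT = re.compile(
    r"""(?xi)
    \b abc        # literal prefix, must start at a word boundary
    [_\-]?        # optional separator instead of any character
    d? [e3]       # optional d, then e or its digit substitute
    \b            # must end at a word boundary
    """
)

def hits(pattern, names):
    """Return the names in which the pattern matches anywhere."""
    return [name for name in names if pattern.search(name)]

# Hypothetical usernames: two intended targets, two innocent bystanders.
SAMPLES = ["abc_de", "abc-3", "Zabcoe_Fan2022", "abcxenon"]

if __name__ == "__main__":
    print("loose:", hits(LOOSE, SAMPLES))
    print("tight:", hits(TIGHT, SAMPLES))
```

In this sketch the loose pattern matches all four samples, while the tightened one matches only the first two. Whether a narrower class like this still catches the real abuse is something only a review of the filter log can answer.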

New proposal

There are two problems with the account creation filters (102, 874, 1196): (1) The standard disallow message that we're using doesn't make any sense and is BITEy. (2) No matter what message we use, disallowing automatic account creations is always BITEy. What are they supposed to do, create two accounts? Expose their IP on EF/FP/R? Ping an enwiki admin from meta? So let's:

(1) Set filters 102, 874 and 1196 to match on manual account creation only, and use a message like this:

After all, no one should expect their first (or second or third) choice on any popular website. All the cool names are taken. Just saying "pick something else" isn't a big deal IMO. But talking about "abuse" and "disruption" and "blocking" and such; let's not do that. Note that I didn't link to WP:EF/FP/R intentionally; if the account doesn't exist yet, they have to expose their IP to make the report.

(2) Create a new filter (we'll call it "Persistent LTA usernames" or something) matching on manual and automatic creation for use only against LTAs who evade the other filters. Everything added to this filter must:

  • Include the date that it was added
  • Include some context as to why it was added (log id of account creation, SPI page, whatever)

Anything with no true positives in a year, or anything added without a date or explanation, will be removed.

I have no idea what kind of message to use for this new filter. Most autocreations are probably from non-native English speakers anyway. But admins should try to monitor the log, and force local creation for anything that looks like a FP. Since it's only a last resort, the log should ideally be pretty sparse. Suffusion of Yellow (talk) 18:38, 8 April 2022 (UTC)[reply]

Edit filter helper privilege for 4nn1l2

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


4nn1l2 (t · c · del · cross-wiki · SUL · edit counter · pages created (xtools • sigma) · non-automated edits · BLP edits · undos · rollbacks · logs (blocks • rights • moves) · rfar · spi) (assign permissions) (acc · ap · fm · mms · npr · pm · pcr · rb · te)

I'm an admin on Commons and would like to be able to see edit filters on English Wikipedia. 4nn1l2 (talk) 02:25, 26 March 2022 (UTC)[reply]

  • No objections here. -- zzuuzz (talk) 18:55, 28 March 2022 (UTC)[reply]
  • Support, per the second EFH criterion: Those working with edit filters on another WMF wiki who want to learn from the English Wikipedia's experience and approach 🐶 EpicPupper (he/him | talk) 02:47, 29 March 2022 (UTC)
  • Support per EpicPupper. No objections here. 3PPYB6TALKCONTRIBS — 12:50, 29 March 2022 (UTC)
 Done xaosflux Talk 16:41, 29 March 2022 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Regarding filter 958

958 (hist · log)

Hello, edit filter helpers/managers!

I was reviewing filter 958, but I noticed that you didn't list all the IPs of the U.S. Congress under WP:SIP. To provide better guidance, I'd like you to add 137.18.0.0/16, 12.185.56.0/29, 12.147.170.144/28, 74.119.128.0/22, 2620:0:E20::/46, 2620:0:8A0::/48, and 2600:803:618::/48, in case any of them edit.

If you cannot add these to the filter, then that is fine. — 3PPYB6TALKCONTRIBS — 17:07, 28 March 2022 (UTC)[reply]

@Legoktm—Courtesy ping as you created this filter. — 3PPYB6TALKCONTRIBS — 17:59, 28 March 2022 (UTC)[reply]
This filter is run for all anonymous edits, and it checks each edit against each IP range. For the sake of efficiency it makes sense for the filter to only use ranges which are actively used. The above are not. BTW I'm not really sure of this filter's general utility, since you could just check the contribs. Does anyone think it's useful? -- zzuuzz (talk) 18:55, 28 March 2022 (UTC)[reply]
This is one of those cases where "conditions" are a really poor measure of performance. An IP range check literally requires converting a few numbers from base 10, then checking against a bitmask. Probably takes a few hundred CPU cycles. But, yes, each check burns through one of our precious 1000 conditions, so I'd rather not do this unless either (A) the filter is converted to use regex, or (B), AbuseFilter is patched so that ip_in_range() takes multiple arguments.
And no, I don't see the point of the filter either. Suffusion of Yellow (talk) 20:01, 28 March 2022 (UTC)[reply]
@Suffusion of Yellow—In that case, I'll just say this: on second thought, merely checking WP:SIP should be enough for you to tell any congressional staffer's edits. If you feel that the filter is no longer required, feel free to delete it. — 3PPYB6TALKCONTRIBS — 21:36, 28 March 2022 (UTC)[reply]
Don't particularly see the point of the filter either. I remember browsing its hits out of general interest before, to see what Congressional IPs are editing, but not sure of the maintenance purpose here (and even if there were one, what makes US Congress edits distinct to various other countries' parliamentary IPs, which aren't filter-logged?) ProcrastinatingReader (talk) 21:43, 3 April 2022 (UTC)[reply]
 Done Disabled the filter. Suffusion of Yellow (talk) 02:04, 8 April 2022 (UTC)[reply]
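The cost argument made above (convert a few decimal numbers, then compare against a bitmask) can be sketched in Python as follows. This is an illustration of the general technique only, not AbuseFilter's actual ip_in_range() implementation. The helper names are hypothetical, and in_any_range() merely imitates the multi-argument behaviour wished for in the thread. Only the IPv4 ranges quoted in the request are used.

```python
import ipaddress

def ipv4_to_int(dotted):
    """Convert dotted-decimal IPv4 text to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def in_range_v4(ip, cidr):
    """Return True if ip lies inside the CIDR range, using one mask test."""
    base, prefix_len = cidr.split("/")
    # Build a mask with prefix_len leading one-bits, e.g. /16 -> 0xFFFF0000.
    mask = (0xFFFFFFFF << (32 - int(prefix_len))) & 0xFFFFFFFF
    return (ipv4_to_int(ip) & mask) == (ipv4_to_int(base) & mask)

def in_range_stdlib(ip, cidr):
    """Same check via the standard library, used as a cross-check."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

# IPv4 ranges quoted in the request at the top of this thread.
RANGES = [
    "137.18.0.0/16",
    "12.185.56.0/29",
    "12.147.170.144/28",
    "74.119.128.0/22",
]

def in_any_range(ip, ranges=RANGES):
    """Hypothetical multi-range helper: True if ip is in any listed range."""
    return any(in_range_v4(ip, cidr) for cidr in ranges)

if __name__ == "__main__":
    for candidate in ["12.185.56.7", "12.185.56.8", "74.119.131.255"]:
        print(candidate, in_any_range(candidate))
```

Each range costs one mask and one comparison, which supports the point that the condition count overstates the real work involved.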

Edit filter helper rights for EpicPupper

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

EpicPupper (t · c · del · cross-wiki · SUL · edit counter · pages created (xtools • sigma) · non-automated edits · BLP edits · undos · rollbacks · logs (blocks • rights • moves) · rfar · spi) (assign permissions) (acc · ap · fm · mms · npr · pm · pcr · rb · te)

Hi there! I've been frequently helping out at the edit filter false positives board, but it's been difficult for me to evaluate filters that are private, as, well, I can't see them. Additionally, I'm a recent changes patroller, and it is helpful to be able to evaluate private edit filters, including ones relating to LTAs and the like. Thank you for your consideration. 🐶 EpicPupper (he/him | talk) 03:05, 29 March 2022 (UTC)[reply]

  • Oppose. You only started contributing to EFFPR under a month ago, and I don't see sustained high activity levels in antivandalism either – I'm sorry, but I don't think I can support granting a right as sensitive as EFH at this time. --Blablubbs (talk) 14:35, 29 March 2022 (UTC)[reply]
  • Oppose per Blablubbs. Thank you for your contributions, but I don't think it's been enough --DannyS712 (talk) 23:25, 31 March 2022 (UTC)
  • Weak oppose My first instinct was to support, based on our interactions. But I don't see enough involvement with filters (yet), and as usual, I don't want to set a precedent where this becomes a rollback-like hat. FWIW I also opposed Danny's first request, but supported the second, so don't take this as a "never". Suffusion of Yellow (talk) 23:48, 31 March 2022 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Edit filter helper rights for The4lines

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


The4lines (t · c · del · cross-wiki · SUL · edit counter · pages created (xtools • sigma) · non-automated edits · BLP edits · undos · rollbacks · logs (blocks • rights • moves) · rfar · spi) (assign permissions) (acc · ap · fm · mms · npr · pm · pcr · rb · te)

Hey everyone, I'm applying to have the EFM helper role so that I can view private filters and aid in the discovery and reporting of LTA accounts. I've been looking at edit filters to learn more about the patterns of LTAs and how to better combat and deal with them. The only problem is that many of the filters used for tracking down edits are private, which makes them impossible to look at. Being involved heavily with CCI and copyright sometimes involves LTAs who are long-time copyright infringers. Being able to track down their socks and evaluate their contributions is really important, as many of you know. Being able to report and block them, along with removing any copyrighted material they have added as quickly as possible, is a high priority. I've been talking with @Oshwah: off wiki quite a bit recently, and he encouraged me to apply for this role. Thanks for taking the time to consider this request everyone. Signed,The4lines |||| (Talk) (Contributions) 04:34, 29 March 2022 (UTC)

  • I confirm that The4lines and I talk often off-wiki (on the "wiki discord"), and I endorse this application for EFH. Hell, I'd just let The4lines have the permissions and be done with it, but I felt that it was more important to have The4lines apply officially. I believe that giving The4lines the ability to view private filters would be helpful. If there are concerns, I will officially take The4lines "under my wing" and I will train The4lines myself - that's not a problem. I see no "red flags", and The4lines assured me that he would keep information private and use the permissions for the good of the project. I will be guiding The4lines and his interest in identifying LTA patterns, activity, and abuse. I will mentor and train The4lines, and I request that this application receive support. :-) ~Oshwah~(talk) (contribs) 05:01, 29 March 2022 (UTC)[reply]
  • I respect Oshwah's opinion, but I'm not sure I see a need for this right at this time. The4lines has made two edits to AIV this year and filed two SPIs - for someone requesting filter access to deal with LTAs, I would expect a more extensive track record in these areas. I also find the reference to CCI confusing as I don't think there are many private filters (if any at all?) that are relevant to CCI work. I think the candidate's stated aims could be accomplished by spending some time reading public LTA and SPI cases, without the need for access to other types of sensitive information that comes along with EFH. Spicy (talk) 01:22, 30 March 2022 (UTC)[reply]
    Confirming that edit filters have pretty much no bearing on CCI - there's 856 which is tangential, but that one's public anyway. firefly ( t · c ) 15:07, 1 April 2022 (UTC)[reply]
    This was my inclination also. However, @Oshwah: are there any particular filters you think The4lines would benefit from being able to view, given the work they've described they like to do in the OP? ProcrastinatingReader (talk) 15:40, 1 April 2022 (UTC)[reply]
    @ProcrastinatingReader: Hey! I don't believe Oshwah is around right now so I'll respond first. Looking back, 51, 53, 579, and 874 were the edit filters that Oshwah wanted me to look at. I'll wait for Oshwah's response though, since he might know more filters that pertain to the work I want to do. Noting two things quickly. First, I don't really want to know how edit filters work (yet), because I'm not very good with technical stuff. I just want to see who trips the filter. There's no experience really needed for looking at edit filters, there's no technical knowledge needed really in my opinion. Second, I don't only want to work with CCI, I also want to work with LTA's with Oshwah and stuff like that. Signed,The4lines |||| (Talk) (Contributions) 16:05, 1 April 2022 (UTC)[reply]
    Yes, those are the filters. ~Oshwah~(talk) (contribs) 01:59, 3 April 2022 (UTC)[reply]
  • I have no problem with this and trust Oshwah's judgement; the fact that he will take The4lines under his wing on this is also encouraging. --TheSandDoctor Talk 16:36, 3 April 2022 (UTC)[reply]
  • I am currently leaning towards opposing this request, largely per Spicy. I can see EFH being useful for someone who already has significant anti-abuse experience and is looking to help develop and track specific LTA filters, but I think it is by no means essential, especially given the nature of the filters mentioned in the request; I can't comment on specifics, but I will say that I would not consider them geared towards tracking complex abuse where filter access provides a strong benefit, and that I think they are mostly helpful to people with the admin bit. On a minor sidenote, I also couldn't find the referenced community Discord interactions from a quick search when I was looking for background, though I might well have missed them. --Blablubbs (talk) 21:53, 3 April 2022 (UTC)[reply]
    @Blablubbs: The reason you can’t see them is because they’re DM or private messages between me and Oshwah. Signed,The4lines |||| (Talk) (Contributions) 00:25, 4 April 2022 (UTC)[reply]
  • @Oshwah: I'm inclined to close this as no-consensus right now, have been holding off to see if more comments came in, having your endorsement is a major "positive" in this discussion and I'm hesitant to derail a new mentorship. EFH isn't very "powerful" but it does have a modest degree of trust - because once added every private filter and almost all former private filter logs are accessible - so it comes down to (a) could this be useful, which seems to be "yes" and (b) is this person trustworthy enough to maintain private information - which is where I'm seeing the no-consensus right now. Will you be able to move forward with some training with The4lines in the absence of this - and perhaps they could reapply in say 6 months? Best regards, — xaosflux Talk 16:16, 7 April 2022 (UTC)[reply]
    Pinging @Oshwah. @Xaosflux To be honest, I thought the need was why everyone was leaning towards oppose. For the trust aspect, having two established editors support me is more than enough trust, and no one said anything about trust. Sorry for coming off a little bit eager, but I’m open to 6 months. I’d like to see what Oshwah says. Best, Signed,The4lines |||| (Talk) (Contributions) 03:19, 8 April 2022 (UTC)
    @The4lines: it requires both, my comment was that while the "need" is a necessary aspect for this - it isn't as important as the trust aspect for this flag, which is why in closing this I didn't give as much weight to the need arguments and have closed as nocon instead of just outright oppose. — xaosflux Talk 10:52, 8 April 2022 (UTC)[reply]
    Xaosflux - That's completely fine; go ahead and close this request as 'no consensus reached'. I'll take some time to have The4lines take part and get proficient with reverting bad-faith edits, harassment, and vandalism, filing good reports at AIV, RFPP, UAA, and eventually SPI. Thanks for responding, and I hope you're having a great evening. :-) ~Oshwah~(talk) (contribs) 06:20, 8 April 2022 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Any objections to re-setting 1017 to disallow?

It was downgraded to tag owing to FPs, but these days I can't see that there's been a false positive in months. It's very effective at picking up its intended targets and disallowing might help limit the amount of revdel required. firefly ( t · c ) 16:53, 2 April 2022 (UTC)[reply]

+1. If FPs do become an issue, it would be better to split it into a disallow and tag-only, since some of those have a 0% chance of being used in good faith, and the first alternative in particular appears in >80% of ESes at least from the latter of the two LTAs using it. On that note, should the filter be renamed to note that it catches two LTAs now? CUs have said they're not the same person, and that was my takeaway from a long behavioral investigation into the newer one. (If this were to be split into disallow and tag-only, the tag filter could also look for a pattern matching variants of the username they've used on-and-off on a variety of sites for 4-5 years, which often comes up in their edits but would probably be too FP-prone.) -- Tamzin[cetacean needed] (she/they) 18:38, 2 April 2022 (UTC)[reply]
@Tamzin filter renamed following the standard scheme firefly ( t · c ) 18:56, 2 April 2022 (UTC)[reply]
I would support moving this to disallow, based on the last month of filter hits. ProcrastinatingReader (talk) 13:33, 3 April 2022 (UTC)[reply]
 Done - set to disallow with the standard message given the daily shenanigans from its target. If false-positive issues arise I'd second Tamzin's idea of a narrowly-constructed disallow filter and a wider tagging one. firefly ( t · c ) 17:55, 3 April 2022 (UTC)[reply]