• CeeBee@lemmy.world
    link
    fedilink
    English
    arrow-up
    0
    ·
    edit-2
    7 months ago

    There are so many ways this can be done that I think you are not thinking of.

    No, I can think of countless ways to do this. I do this kind of thing every single day.

    What I’m saying is that you need to account for every possibility. You need to isolate all the deleted comments that fit the criteria of the “Reddit Exodus”.

    How do you do that? Do you narrow it down to a timeframe?

    The easiest way to do this is identify all deleted accounts, find the backup with the most recent version of their profile with non-deleted comments, and insert that user back into the main database (not the prod db).

    Now you need to parse billions upon billions upon billions of records. And yes, it’s billions because you need the system to search through all the records to know which record fits the parameters. And you need to do that across multiple backups for each deleted profile/comment.

    It’s a lot of work. And what’s the payoff? A few good comments and a ton of “yes this ^” comments.

    I sincerely doubt it’s worth the effort.

    Edit: formatting

    • yeehaw@lemmy.ca
      link
      fedilink
      English
      arrow-up
      0
      ·
      7 months ago

      How do you do that? Do you narrow it down to a timeframe?

      When a user edits a comment, they submit a response. When they submit a response, they trigger an action. An action can do validation steps and call methods, just like I said above, for example. When the edit action is triggered, check the timestamp against the previously edited comment’s timestamp. If the previous - or previous 5 are less than a given timeframe, flag it. “Shadowban” the user. Make it look like they’ve updated their comments to them, but in reality they’re the same.

      We’ve had detection methods for this sort of thing for a long time. Thing about how spam filtering works. If you’re using some tool to scramble your data, they likely have patterns. To think reddit doesn’t have some means to protect itself against this is naive. It’s their whole business. All these user submitted comments are worth money.

      Now you need to parse billions upon billions upon billions of records. And yes, it’s billions because you need the system to search through all the records to know which record fits the parameters. And you need to do that across multiple backups for each deleted profile/comment.

      This makes me thing you don’t understand my meaning. I think you’re talking about one day reddit decides to search for an restore obfuscated and deleted comments. Yes, that would be a large undertaking. This is not what I’m suggesting at all. Stop it while it’s happening, not later. Patterns and trends can easily identify when a user is doing something like shreddit or the like, then the code can act on it.

      It’s a lot of work. And what’s the payoff? A few good comments and a ton of “yes this ^” comments.

      this