https://github.com/LemmyNet/lemmy/issues/3245

I posted far more details on the issue than I am putting here.

But just to bring some math in: with the current full-mesh federation model, assuming 10,000 instances, that will require nearly 50 million connections (10,000 * 9,999 / 2 = 49,995,000).

Each comment. Each vote. Each post will have to be sent 50 million separate times.

In the proposed hub-spoke model, we can reduce that by over 99%, so that each post/vote/comment/etc. only has to be sent 10,000 times (plus n*(n-1)/2 times, where n = number of hub servers).
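As a rough check on the numbers above, here is a minimal Python sketch. It counts links between servers, which is what the two formulas describe. The function names and the choice of 10 hub servers are assumptions made for illustration; the post itself only fixes the 10,000 instances and the formulas.

```python
def full_mesh_links(n: int) -> int:
    """Number of distinct instance pairs in a full mesh: n*(n-1)/2."""
    return n * (n - 1) // 2


def hub_spoke_links(n: int, hubs: int) -> int:
    """One link per instance to a hub, plus a full mesh among the hubs."""
    return n + full_mesh_links(hubs)


instances = 10_000
hubs = 10  # hypothetical hub count, not taken from the proposal

mesh = full_mesh_links(instances)
spoke = hub_spoke_links(instances, hubs)
reduction = 1 - spoke / mesh

print(f"full mesh:  {mesh:,} links")
print(f"hub-spoke:  {spoke:,} links")
print(f"reduction:  {reduction:.2%}")
```

With these assumed values the full mesh comes to 49,995,000 links and the hub-spoke layout to 10,045, which is where the "over 99%" figure comes from.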

The current full-mesh architecture will not scale. I predict exponential growth will continue.

Let’s work on a solution to this problem together.

  • HTTP_404_NotFound@lemmyonline.com (OP)
    1 year ago

    It’s not wrong; we just have opposite ideas here.

    The 50 million is based on the formula for a full-mesh network, where all instances talk to each other. In the case of Lemmy, this would be an absolute worst-case scenario, where every instance is subscribed to a community on every other instance.

    In your example of only 10,000 messages, you are assuming that the 10,000 instances in existence are ONLY looking at a single community on a single server.

    Let’s say those 10,000 instances all decide to look at a community on another server. Now you have 20,000 connections.

    Let’s add another community, hosted on yet another instance. That is 30,000 connections.

    TL;DR:

    My example is based on the worst-case scenario (a pretty unachievable one at that!).

    Your example is based on the best-case scenario.

    Realistically, the actual outcome would be somewhere much closer to the best-case scenario (as communities seem to lump up on the big servers). However, when planning architecture, you always assume the worst-case scenario.

    • bdonvr@thelemmy.club
      1 year ago

      No - you said:

      “Each comment. Each vote. Each post will have to be sent 50 million separate times.”

      That won’t ever happen unless there are 50 million instances. That’s not a worst case; it’s just not a case.

      There is no case in the current implementation where any one action is replicated more times than there are total instances.

      And it doesn’t matter what “model” you assume: each action will have to federate to each instance eventually. That count is, at minimum, the total number of instances.

      “Let’s say those 10,000 instances all decide to look at a community on another server. Now you have 20,000 connections.”

      Looking does nothing; each instance essentially hosts a copy of the “host instance” for each community. Only interactions (comments, likes, posts, etc.) are federated.

      • HTTP_404_NotFound@lemmyonline.com (OP)
        1 year ago

        For fuck’s sake, dude, be collaborative, not defensive. This isn’t Reddit; I am not out to attack your karma.

        If every instance hosts a community, and every other instance subscribes to every one of those communities, that would lead to a full mesh between all instances, resulting in the worst-case scenario, i.e., following the formula I provided for a full-mesh topology.

        That is indeed the worst-case scenario I have provided, explained, and documented in my examples.

        If my example is too hard to understand, let’s use an easier example:

        Count the number of instances on https://lemmy.ml/instances

        Assume every one of those instances subscribes to !asklemmy.

        Now, count the number of instances on https://lemmy.world/instances

        Assume every one of those instances subscribes to !lemmyworld.

        Now, count the number of instances on https://beehaw.org/instances

        Assume every one of those instances subscribes to !technology.

        It does. not. scale.

        • delcake@lemmy.songsforno.one
          1 year ago

          In no way is the person you’re responding to speaking defensively. They’ve discussed the reason why your extrapolation to a full-mesh connective worst-case scenario isn’t based in the reality of how ActivityPub functions. But you don’t seem to be willing to entertain the notion that the federation of any given action never exceeds the number of instances subscribed to the community that generated it.

          Even should every instance subscribe to every community on every other instance, the recipient of a federated action doesn’t turn around and rebroadcast that action back onto the network, because it is not the authoritative host of that community. Therefore, what this discussion is lacking is proof of where this exponential broadcast storm of federated actions comes from in your assertion.

        • King@vlemmy.net
          1 year ago

          Yes, it is a “full mesh” diagram. But for each specific “federated” action, it is a simple hub and spoke distribution. The hosting server will send the federated action to each subscribed node. The nodes don’t need to check in with each other for that specific action.

          I too believe that federation is going to have scaling issues, but not due to full mesh.
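
          To make the per-action picture above concrete, here is a small Python sketch. It is a toy model and not Lemmy’s code: the instance names, the `Community` class, and `deliveries_for_action` are all made up for illustration. It assumes, as described above, that only the hosting server sends an action out and that receivers do not rebroadcast it.

          ```python
          from dataclasses import dataclass, field


          @dataclass
          class Community:
              host: str                                            # instance that owns the community
              subscribers: set[str] = field(default_factory=set)   # instances following it


          def deliveries_for_action(community: Community) -> list[tuple[str, str]]:
              """(sender, receiver) pairs for ONE action: host -> each remote subscriber."""
              return [(community.host, inst)
                      for inst in sorted(community.subscribers)
                      if inst != community.host]


          instances = [f"instance{i}.example" for i in range(5)]

          # Worst case: every instance hosts a community and all others subscribe.
          communities = [Community(host=h, subscribers=set(instances)) for h in instances]

          # Deliveries triggered by a single action in each community.
          per_action = [len(deliveries_for_action(c)) for c in communities]

          # Distinct instance pairs that ever exchange traffic across all communities.
          pairs = {frozenset(p) for c in communities for p in deliveries_for_action(c)}

          print(per_action)
          print(len(pairs))
          ```

          With five instances in this worst case, each single action goes out to 4 receivers (n - 1), while the number of distinct instance pairs that talk at all is 10 (n*(n-1)/2). The full-mesh formula describes the second number, not the cost of any one action.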

          • HTTP_404_NotFound@lemmyonline.com (OP)
            1 year ago

            I am on board with you there.

            But would you not agree that delegating and offloading those federation actions to a dedicated pool of servers would assist scalability?

            That way, each instance doesn’t need to maintain all of the connections?

            • King@vlemmy.net
              1 year ago

              There is no need to “maintain all of the connections”. The server opens a connection, sends the data, then closes the connection.

                • Fauxreigner@beehaw.org
                  1 year ago

                  Federation isn’t working well, but it’s not working well because the big instances aren’t able to keep up with all of the inbound/outbound messages, and if a message fails, that’s it. Right now there’s no automated way to resync and catch up on missed activity.

                  • cyd@vlemmy.net
                    1 year ago

                    How was syncing done in Usenet? It has a very similar decentralized model, and I don’t recall there being problems of data loss due to desyncing between servers.