Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subscription Memory Improvements #1059

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

benwilson512
Copy link
Contributor

@benwilson512 benwilson512 commented Mar 30, 2021

The idea here, per a conversation with @josevalim is to try to minimize the amount of information that is duplicated when clients subscribe to multiple documents.

The largest value that is generally duplicated today is the context. Initial results show:

5x improvement for clients with lots of subscriptions per client (150).
3x Improvement for clients with a low number of subscriptions per client (7).

Those numbers double to 10x and 6x on Elixir master where Registry supports compressed :ets tables.

Approach

The main idea here is that Absinthe adds a second Registry with :unique keys, and we'll use this registry to manage singular values (like the context) associated with that process.

Current Issues

  • A process is only allowed to have a single context. This isn't necessarily true.
  • Unregister probably doesn't work
  • The lack of failing test cases highlights some missing coverage around context management.

@@ -118,13 +119,14 @@ defmodule Absinthe.Subscription do
}
}

{:ok, _} = Registry.register(registry, field_key, doc_value)
{:ok, _} = Registry.register(registry, {self(), doc_id}, field_key)
_ = Registry.register(uniq_registry, {self(), :context}, doc.execution.context)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going to have to match on {:already_registered, fetch the current context, and then have the value be some sort of map of %{context_id => context}.

This complicates unregister because it'll need to look at the counter and only remove the context when there are no documents remaining with that context_id. This may not be very efficient to lookup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some general thoughts:

  1. You don't need to store all information on the registry, only the information that you are going to read from another process. So for example, the map could be part of the process dictionary, or you could return a ref on subscribe which people need to use to unsubscribe (at least if they want it to be fast)

  2. Could you store {self(), context_id} instead of :context?

  3. Maybe you don't need two registries. You can keep this as a duplicate registry and you do a read before you figure out if you need to store the context. On the other hand, lookups on duplicate registries are expensive, because you need to traverse all partitions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, lookups on duplicate registries are expensive, because you need to traverse all partitions.

I just pushed Registry.values/3 that allows you to read a key for a given pid, this makes looking up a key for a given pid efficient for both unique and duplicate registries.

There is also an argument to add a has_key? function but I am waiting a bit until we are sure we will need it.

{:ok, _} = Registry.register(registry, {self(), doc_id}, field_key)
_ = Registry.register(uniq_registry, {self(), :context}, doc.execution.context)
{:ok, _} = Registry.register(dup_registry, field_key, doc_value)
{:ok, _} = Registry.register(dup_registry, {self(), doc_id}, field_key)
Copy link
Contributor Author

@benwilson512 benwilson512 Mar 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the reasons we have all of these keys is that we basically are using it to stash Absinthe specific meta-data about a given pid and its subscriptions somewhere. We could use the process dictionary, since the whole point is that it is process specific state. Feels dirty though, and it wouldn't help with the context storage because that needs to be accessible to other pids. Mostly it would clean up unsubscribe inefficiencies.

@benwilson512 benwilson512 force-pushed the further-memory-improvements branch from 77a91d3 to e5674b7 Compare April 23, 2021 16:17
@beligante
Copy link
Contributor

@benwilson512

In my company, we have an excellent use case where we could benefit from what this PR was addressing originally and we might have time to work on this in the next few days. Do you think this is something you and the Absinthe team still want to address? We'll be glad to contribute if you're interested, if yes is there is any advice for moving it forward?

@benwilson512
Copy link
Contributor Author

and we might have time to work on this in the next few days.

Hey that's exciting to hear! Please shoot me an email at ben [dot] wilson [at] cargosense [dot] com I'd love to get a minute to chat with you about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants