Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate storage for the same component #1063

Closed
hfn92 opened this issue Sep 10, 2023 · 41 comments
Closed

Duplicate storage for the same component #1063

hfn92 opened this issue Sep 10, 2023 · 41 comments
Assignees
Labels
documentation docs related issues/requests solved available upstream or in a branch

Comments

@hfn92
Copy link

hfn92 commented Sep 10, 2023

I've been running into the problem that some components went "missing". After investigating I noticed at some point a second storage for the same type is being created.

I used following code for outputting info (trying with different namespaces and without typedef)

      TRACE << " E        " << (Uint32)e.entity();
      TRACE << " R        " << e.registry();
      TRACE << " S        " << &e.registry()->storage<::Children>();
      TRACE << " S (size) " << e.registry()->storage<::Children>().size();
      TRACE << " T1       " << entt::type_id<::Children>().hash();
      TRACE << " T2       " << entt::type_id<std::vector<entt::entity>>().hash();

I was able to get two different outputs right after creating the entity

 E        418
 R        0x55a490312658
 S        0x55a490341680
 S (size) 3
 T1       2301207559
 T2       2301207559
 E        418
 R        0x55a490312658
 S        0x55a4903d5e90 // different storage
 S (size) 0
 T1       2301207559
 T2       2301207559

There now seems be be a new storage for the type Children.

When printing out all storages with

for (auto [n, s] : e.registry()->storage())
{
    TRACE << s.type().name() << " : " << &s << (entt::type_id<::Children>().hash() == n ? " <<<<" : "");
}

I get following output (:exclamation: this is from a different run so the pointers are different from above):

std::vector<entt::entity> : 0x55e27bf4cbe0 <<<<
[...]
std::vector<entt::entity> : 0x55e27bd9eb30

Things to note

  • I started to notice it only in release mode
  • This might have started suddenly without any code change regarding this entity (but I'm not sure)
  • Happens in code that is part of the same executable (not between shared libs or anything like that)
  • I've noticed this even in the same compilation unit!
  • Hard to reproduce. Slight changes in code affect this behaviour.
  • compiler is gcc 13.2.1 20230801
  • entt version 3.12.2

I suspect it has something to do with hashes being created differently, since in the output above only one of the strorages hash matches. But I'm a bit at a loss here on how to proceed 😅

@skypjack skypjack self-assigned this Sep 11, 2023
@skypjack skypjack added the triage pending issue, PR or whatever label Sep 11, 2023
@skypjack
Copy link
Owner

It would be the first case of conflict in many years but anything is possible. How is Children defined?
You can try to give it a user defined hash that is different from the auto-generated one.
If you're right on the root cause, this would solve the problem at once.

@hfn92
Copy link
Author

hfn92 commented Sep 11, 2023

How is Children defined?

using Children = std::vector<entt::entity>;

You can try to give it a user defined hash that is different from the auto-generated one.

You mean like this?

namespace entt
{
template<>
struct type_hash<::Children, void> final
{

  [[nodiscard]] static constexpr id_type value() noexcept
  {
    return ~0;
  }

  [[nodiscard]] constexpr operator id_type() const noexcept
  {
    return value();
  }
};
} // namespace entt

This solve the issue for me. Hash is forced to 4294967295 and I only get one storage

 E        418
 R        0x55c8934b98e8
 S        0x55c893445970
 S (size) 3
 T1       4294967295
 T2       4294967295
[...]
std::vector<entt::entity> : 0x55c893445970 <<<<
[...]

@skypjack
Copy link
Owner

Wait a moment.
std::vector<entt::entity> and Children ARE the same type. It's not a conflict.
using declarations don't introduce new types. They define aliases and nothing more.

@hfn92
Copy link
Author

hfn92 commented Sep 11, 2023

I know. I just wanted to make sure std::vector<entt::entity> and Children are resolved to the same hash.

EDIT: Also my problem is that I get multiple storages for the same type (std::vector<entt::entity>). When I should just get one storage.

@skypjack
Copy link
Owner

A repro would help but 🤷‍♂️ no idea what you're doing that can cause this, I'm sorry.
I'm glad the custom hash fixed it though. I'm afraid I can't do much more for this issue.

@hfn92
Copy link
Author

hfn92 commented Sep 11, 2023

This looks like a nasty bug somewhere

When I print the storage at my first usage I get:

TRACE << "First usage : " << i.registry()->storage<Children>().type().hash() << " "
          << &i.registry()->storage<Children>();
Output:
First usage : 2301207559 0x560da62d7fc0

When I then later print all storages I get:

(name, address, hash)
std::vector<entt::entity> : 0x560da6526a10 2301207559
std::vector<entt::entity> : 0x560da62d7fc0 2361607097

It looks like the hash of the initial storage at 0x560da62d7fc0 changed.
The hash seems to be computed as 2301207559 consistently throughout the program. So I'm really unsure as to why the custom hash helped at all.

I'll try to get minimal repro but I don't have a lot of hope 😆

EDIT: This info is wrong. The storage already existed before that. Created by on_destruct. But the first emplace creates the second storage with the wrong hash.

@hfn92
Copy link
Author

hfn92 commented Sep 11, 2023

Two new things.

  1. Defining ENTT_STANDARD_CPP also helps.

  2. I added debug printing to the entt code (basic_registry::assure).

  template <typename Type>
  [[nodiscard]] auto &
  assure([[maybe_unused]] const id_type id = type_hash<Type>::value()) {
    if constexpr (std::is_same_v<Type, entity_type>) {
      return entities;
    } else {
      static_assert(std::is_same_v<Type, std::decay_t<Type>>,
                    "Non-decayed types not allowed");
      auto &cpool = pools[id];

      if (!cpool) {
        using storage_type = storage_for_type<Type>;
        using alloc_type = typename storage_type::allocator_type;

        auto op = type_id<Type>();
        printf("Type=%s \ntype_id<Type>().hash()=%u \nid=%u "
               "\ntype_hash<Type>::value()=%u\n",
               op.name().data(), op.hash(), id, type_hash<Type>::value());

        if constexpr (std::is_same_v<Type, void> &&
                      !std::is_constructible_v<alloc_type, allocator_type>) {

And for std::vector<entt::entity> it prints

Type=std::vector<entt::entity>] 
type_id<Type>().hash()=2301207559 
id=2361607097 
type_hash<Type>::value()=2301207559

The id that is passed as default value is inconsistent with those computed in the method.

While for other types it"s consistent

Type=Light] 
type_id<Type>().hash()=2329082063 
id=2329082063 
type_hash<Type>::value()=2329082063

Any ideas? Is this a compiler issue?

@skypjack
Copy link
Owner

To be honest, it really really really smells of UB. As if you were a moved-from registry or a dangling reference. Sort of.
Have you already considered this and investigated to exclude it? Without a repro, I'm just trying to think aloud but it's very difficult to help. I'm sorry.
Is this a public project btw? Maybe I can try to reproduce it locally.

@hfn92
Copy link
Author

hfn92 commented Sep 12, 2023

To be honest, it really really really smells of UB. As if you were a moved-from registry or a dangling reference. Sort of. Have you already considered this and investigated to exclude it?

Yes I'm definitely using the same registry (you can see the same pointers for the registry in the first post). I'm also adding thousands of components in the same function, but only std::vector<entt::entity> causes issues. Running the program with address sanitizer or undefined behavior sanitizer shows no issues as well.

Also I don't see how UB would cause id = type_hash<Type>::value() (function arg) to be different from type_hash<Type>::value() (function body) in the same method call. This should be constexpr right? Imo this has to be an issue with the compiler...

Without a repro, I'm just trying to think aloud but it's very difficult to help. I'm sorry.

Sorry I wish I had one 😭 But all I'm doing is

reg.emplace<std::vector<entt::entity>>(e)

with different random result.

Is this a public project btw? Maybe I can try to reproduce it locally.

No sorry,

I realize this is difficult to remote diagnose. Still I really appreciate your help 👍

@hfn92
Copy link
Author

hfn92 commented Sep 12, 2023

Btw if I change Children to

using Children = llvm::SmallVector<entt::entity, 4>;

I also have no issues ... I guess the hasher does not like std::vector<entt::entity> 😆

@skypjack
Copy link
Owner

Maybe the using declaration gives it a headache? I guess you tried to reproduce it like this, right?
I can't really explain why it should be a problem but I don't have access to the real code either, so 🤷‍♂️ hard to say.

@Green-Sky
Copy link
Contributor

I personally would not use using to specify a component type. You could use
class Children : public std::vector<entt::entity> {} or similar to create a new type.

@hfn92
Copy link
Author

hfn92 commented Sep 12, 2023

I personally would not use using to specify a component type. You could use class Children : public std::vector<entt::entity> {} or similar to create a new type.

I don't want to create a new type. I was just using a typedef for convenience. The issue persists regardless of using the typedef or std::vector<entt::entity> directly.

@hfn92
Copy link
Author

hfn92 commented Sep 13, 2023

I have a repro, but I almost certain you won't get the same result 😞
https://github.com/hfn92/entt-repro

This is my output. It shows two storages with the same type but different hash. (The hash of the type in storage is different from the has used as map key)

2361607097 : std::vector<entt::entity, std::allocator<entt::entity> >
2301207559 : std::vector<entt::entity, std::allocator<entt::entity> >
Error detected!!! 2301207559 vs 2361607097

Minor unrelated changes such as this already fix this issue ...

--- a/Prototypes.h
+++ b/Prototypes.h
@@ -11,6 +11,6 @@ class Prototypes
 public:
   static void Create(entt::registry& reg, std::string_view i)
   {
-    entt::handle h{ reg, reg.create() };
+    // entt::handle h{ reg, reg.create() };
   }
 };

@skypjack
Copy link
Owner

Ehy, sorry for the late reply, I've been busy last week. Gonna try it soon. 👍

@skypjack
Copy link
Owner

I can't reproduce your result with this code. To be honest, I don't even see where Prototypes is used and therefore I don't see how changing it would change anything. As far as I can tell, any decent compiler should ignore and discard this class since it's irrelevant. 🤔
Are you sure you pushed the right code?

@hfn92
Copy link
Author

hfn92 commented Sep 18, 2023

I can't reproduce your result with this code.

Yeah I didn't think it would work for you 😅

I don't even see where Prototypes is used and therefore I don't see how changing it would change anything. As far as I can tell, any decent compiler should ignore and discard this class since it's irrelevant. 🤔
Are you sure you pushed the right code?

The class is not used. The code is utter nonsense since I just removed code from my engine just until before the error would stop occurring.
I still think this I a compiler issues unless entt invokes UB during the constexpr calculation of the hash, which I don't believe. So unless you use gcc 13.2.1 20230801 in RelWithDebInfo you wont get the same result.

@hfn92
Copy link
Author

hfn92 commented Sep 18, 2023

I have confirmed that the compiler uses two different strings to compute the hash during compile time.

std::vector<entt::entity, std::allocator<entt::entity> >  // "correct"
std::vector<entt::entity>

With hashes

2361607097   // "correct"
2301207559

I have no idea what to do with that information 🤣

@skypjack
Copy link
Owner

I mean, have you tried with a different compiler too? Like, dunno, clang or even gcc 14?
I'm not 100% convinced that your error is due to a compiler issue but I can't help much if I can't reproduce it.

@GavinNL
Copy link

GavinNL commented Sep 22, 2023

Hi @hfn92 , @skypjack

I want to chime in here, because I'm getting some very similar behaviour as well. I have a project that was running perfectly using entt 3.6, but recently updated it to 3.12 and I noticed one of my unit tests was failing. But it would only fail when running in the Continuious Integration system we are running.

I too tracked problem down to having two storage locations for the same type. When creating the object it would create one storage, then when looking it up, it would search in another storage location.

My component is also a std::vector<some_type> so maybe it has to do with that.

I got the same error using gcc8 and gcc11. But ONLY when they are running inside a docker container. When I compile the same unit tests on my main system (which use gcc11), all the tests pass and I have no problems.

I tried the suggested solution of implementing a custom type_hash function. This worked! Thank you!

I also tried not using std::vector as a component, but instead defined a wrapper struct around the vector like so:

template<typename T>
struct VectorComponent
{
    std::vector<T> _vector;
}

This also seemed to work. So the problem may be due to how std::vectors are being type_hashed. It may be a compiler bug. I will try to compile using clang and see if I get the same issue

@Green-Sky
Copy link
Contributor

To be clear, the type hash relies on a string, which is implementation defined behavoir (/ compiler specific extension?), and as such is just best effort. But it rarely breaks. eg: I had to define the hashes myself, because I went across the compiler boundary.

@GavinNL if you just want it to work and are ok with extra types (like the VectorComponent<T>) you can just inherit from the original type and you don't need to adjust code (like with a wrapper).
eg class Children : public std::vector<entt::entity> {} or similar.

@GavinNL
Copy link

GavinNL commented Sep 23, 2023

Thanks for the info @Green-Sky . I was going to inherit from the vector, but I heard it's bad practice to inherit from std containers. I was only using that component in one place, so it was pretty easy to fix.

I just did some testing with the hashs here's what happened on my tests.

  • This problem only occurred on GCC8 and GCC11 in Ubuntu Docker containers. It worked fine on my main Linux Mint host
  • It works fine on MSVC and Clang 14
  • implementing the type_hash<std::vector<MyType> > fixed the problem on GCC but MSVC started failing unit tests. Didn't look into this too much, I just ended up wrapping the type_hash implementation in #if blocks

The type_hash is used to basically get a number so that it can be used as a key for a map, correct? Is there a reason std::type_index wasn't used instead? constexpr reasons, maybe?

@skypjack
Copy link
Owner

This problem only occurred on GCC8 and GCC11 in Ubuntu Docker containers

This is interesting though. We have these compilers on the CI, so in theory we can reproduce the issue with a minimal repro?

@GavinNL
Copy link

GavinNL commented Sep 25, 2023

Hi @skypjack , I will try to get you a minimal code that reproduces this error

@skypjack
Copy link
Owner

Not sure if this helps to fix it but it would be great if we had something that fails on the CI because I cannot reproduce the error locally and I don't have gcc11 up and running locally, I'm sorry. 🙏

@GavinNL
Copy link

GavinNL commented Sep 25, 2023

I'll do my best. The main problem was using std::vector<SomeStruct> as a component on an entity. When creating the component, the has that was being calculated (and therefore the storage) for the component was different from when looking up the component using registry.get< std::vector<SomeComponent> >(e)

It was a really odd issue because it wasn't very inconsistent in when it would fail. Only my unit tests were failing, my actual application was running fine. I had the registry.emplace< std::vector<SomeComponent> >(e) in a header file, while registry.get< std::vector<SomeComponent> >(e) was in a source file

Its likely the "string" that's being generated by gcc that is used for computing the hash for std::vector was not being generated the same way in different parts of the code.

@GavinNL
Copy link

GavinNL commented Sep 25, 2023

I just tested @hfn92 example code on GCC11 and I'm getting the same errors he is.

I also tested in Clang 14 debug/release, they both seem to be okay

I can confirm that the error is happening on GCC Release builds, it does not happen on Debug. That would explain why my tests were failing in the CI. Our CI was building in release mode, and I was building in debug mode locally.

@skypjack ,Is your CI building in debug mode or release mode?

Screenshot from 2023-09-25 19-26-20

@GavinNL
Copy link

GavinNL commented Sep 25, 2023

Here's @hfn92 test, with some cmake and conan files. This definitely fails on my end

entt_test.zip

My system is: Linux Mint 21/Ubuntu 22.04, GCC11.4 + release build

@skypjack
Copy link
Owner

Thank you a lot. That's really funny btw. I wonder what changes in release mode on the gcc side.

@GavinNL
Copy link

GavinNL commented Sep 26, 2023

I just did some additional testing. In the code I gave you, there are two calls to emplace<Children> one in main and one in Create(). If you add the following right after each call to emplace

    std::cout << "Calling  emplace<Children>()" << std::endl;
    constexpr auto stripped = entt::internal::stripped_type_name<Children>();
    constexpr auto value = entt::hashed_string::value(stripped.data(), stripped.size());
    std::cout << "   Stripped name :" << stripped << std::endl;
    std::cout << "   hashed   name :" << value << std::endl;

This is the output I got when running with GCC11.4+Release+Ubuntu22.04

In Create
Calling  emplace<Children>()
   Stripped name :std::vector<entt::entity, std::allocator<entt::entity> >
   hashed   name :2361607097
2361607097 : std::vector<entt::entity>
In Main
Calling  emplace<Children>()
   Stripped name :std::vector<entt::entity>
   hashed   name :2301207559
2361607097 : std::vector<entt::entity>
2301207559 : std::vector<entt::entity>
Error detected!!! 2361607097 vs 2301207559
Ok! 2301207559 == 2301207559
  • It looks like the stripped_name is different on each of the different calls resulting in a different hash being calculated

  • This seems to be happening when the two emplace< > are in different compilation units. Moving the Create() function to main.cpp causes it to work again and results in the following:

In Create
Calling  emplace<Children>()
   Stripped name :std::vector<entt::entity>
   hashed   name :2301207559
2301207559 : std::vector<entt::entity>
In Main
Calling  emplace<Children>()
   Stripped name :std::vector<entt::entity>
   hashed   name :2301207559
2301207559 : std::vector<entt::entity>
Ok! 2301207559 == 2301207559

@Innokentiy-Alaytsev
Copy link
Contributor

Could it be that at some point in the code the compiler is unable to see the whole type declaration, making it skip the allocator... part?

@GavinNL
Copy link

GavinNL commented Sep 27, 2023

Some more poking around...

In Prototype.h there is a class with a static member function. This header file is included in both cpp files, but its not being used.

If I comment out the static member function, I get the following, which seems like it's working. Note that the name does not contain the allocator part.

In Create
Calling  emplace<Children>()
   Stripped name :std::vector<entt::entity>
   hashed   name :2301207559
2301207559 : std::vector<entt::entity>
In Main
Calling  emplace<Children>()
   Stripped name :std::vector<entt::entity>
   hashed   name :2301207559
2301207559 : std::vector<entt::entity>
Ok! 2301207559 == 2301207559

I don't know which one should be considered the correct name, std::vector<entt::entity> or std::vector<entt::entity, std::allocator<entt::entity>> . Clang's name resolves to std::vector<entt::entity>

In release mode, I think if it can detect a function isn't being used, it doesn't include it in the compilation, but why that would change the name of the compile-time string, I don't know.

It might be worth asking a GCC developer. I don't know if this is a bug or not

@hfn92
Copy link
Author

hfn92 commented Sep 28, 2023

It might be worth asking a GCC developer. I don't know if this is a bug or not

I have, just no response yet :)

entt uses __PRETTY_FUNCTION__
which generates a string like
constexpr auto stripped_type_name() [with Type =std::vector<entt::entity>]
and then strips away the parts not needed.

If I comment out the static member function, I get the following, which seems like it's working. Note that the name does not contain the allocator part.

I remember getting a change in behavior from completely unrelated changes. Like here #1063 (comment)

I think std::vector<entt::entity> should be correct name. Because thats whats also used in debug mode.

@skypjack
Copy link
Owner

I find this one fascinating but I don't see what EnTT can do here. It really looks like an issue on the compiler side.
Fortunately, the library provides a tool to get around the problem and you confirmed that it works like a charm.

I'm open to suggestions though. If you think we can do more, please drop a few lines to describe your idea.
Otherwise, I think I'll close the issue as won't fix sooner or later because I don't see what can I do for that.

@hfn92
Copy link
Author

hfn92 commented Sep 28, 2023

I'm open to suggestions though. If you think we can do more, please drop a few lines to describe your idea.

When defining ENTT_STANDARD_CPP entt switches to another method of computing the hash, which I think solves the issues. But we seem to loose the ability the use .name() since it just returns an empty string. This makes debugging harder and breaks my ImGui ui for displaying compoments 😅

Would it be reasonable to provide a way to switch to the counting method for hashes while still retaining the name? In this case I don't mind that the string might not be "correct", since I just use it for display purposes

@skypjack
Copy link
Owner

Well, you can specialize type_name and make it return the pretty function name. EnTT is fully configurable in this sense.
Also, keep in mind that the 'ENTT_STANDARD_CPP` way doesn't work across boundaries because it uses local static variables. The standard doesn't rule on them and different compilers treat them differently.
If you don't care about this, then yes, this is another viable solution. 👍

@GavinNL
Copy link

GavinNL commented Sep 28, 2023

We have a solution and I'm fine with "wont fix:, but I think we should at least have this issue searchable in case other people have have come across the same issue.

@skypjack
Copy link
Owner

Fair enough but how do we make the issue searchable? I mean, it's so already and it contains lot of words that match with the right keywords. How can we improve this eventually?

@GavinNL
Copy link

GavinNL commented Sep 29, 2023

Instead of setting this issue as closed/wont fix, is there a way to mark it as "Known Issue" ?

If not, maybe we can put it in the documentation under a section called "Known Issues"? We can explain the error along with the solution.

@skypjack
Copy link
Owner

I don't think of a way to do that in GH but EnTT has a FAQ doc that fits the purpose maybe?

@GavinNL
Copy link

GavinNL commented Sep 29, 2023

Sure, that will work :)

@skypjack skypjack added documentation docs related issues/requests solved available upstream or in a branch and removed triage pending issue, PR or whatever labels Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation docs related issues/requests solved available upstream or in a branch
Projects
None yet
Development

No branches or pull requests

5 participants