Adapting the Code - Key Considerations

This entire section assumes we are using LevelDB state storage for Fabric, not CouchDB. I still need to write another article explaining why, and how the two options differ, but for now you can read the docs.

What are keys?

In a Key/Value database, the Key is what we use to look up the value. You can also think of it as a "Name" or "ID".

Key       Value
Player1   {"Name": "Player1", "Level": 10, "Packs": [...] }
Player2   {"Name": "Player2", "Level": 27, "Packs": [...] }

In this example the value is shown as JSON, but it can be anything.

Types of Keys

In Fabric there are two types of keys:

Simple Keys

The key shown in the example above is a simple key; it is just a string. 

Simple keys can only be queried by range like pages in a dictionary: "A" to "C", or "Ab" to "Ae".
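
To make the dictionary analogy concrete, here is a minimal sketch of a range query over sorted string keys. It is plain Python standing in for the ledger state, not Fabric chaincode: the `state` dict, the player names, and the `get_state_by_range` helper are all made up for illustration. It assumes the start key is included and the end key is excluded, which matches my understanding of how Fabric range queries behave.

```python
from bisect import bisect_left

def get_state_by_range(state, start_key, end_key):
    """Return (key, value) pairs with start_key <= key < end_key, in key order."""
    keys = sorted(state)
    lo = bisect_left(keys, start_key)  # first key >= start_key
    hi = bisect_left(keys, end_key)    # first key >= end_key (excluded)
    return [(k, state[k]) for k in keys[lo:hi]]

# Hypothetical simple keys: each key is just a player name.
state = {
    "Abby": {"Level": 3},
    "Adam": {"Level": 12},
    "Bella": {"Level": 7},
    "Carl": {"Level": 27},
}

a_to_c = [k for k, _ in get_state_by_range(state, "A", "C")]
ab_to_ae = [k for k, _ in get_state_by_range(state, "Ab", "Ae")]

print(a_to_c)    # ['Abby', 'Adam', 'Bella']
print(ab_to_ae)  # ['Abby', 'Adam']
```

Note that "Carl" is not returned by the "A" to "C" query: it sorts after "C", just as it would appear after the "C" heading in a dictionary.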

Composite Keys

Composite keys use a standard that supports basic query functions with a highly optimized data structure. It is possible to perform even more complex queries when using CouchDB for state storage, but that is not what we are using.

A composite key has two parts:

  1. Object Type
    • This single string creates partitions of keys based on the type of object
    • Example: PLAYER_PROFILE
  2. Attributes
    • This is an arbitrary number of strings that further break down the key space
    • Only strings can be used, so all non-string attributes must be stringified
    • Example: Player1

Fabric uses null characters to separate the parts of a composite key, which ensures there are no accidental key collisions. I am using the vertical bar character | to visualize these.
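
As a rough sketch of how such a key is assembled: the helpers below are Python stand-ins that I wrote for illustration, not the Fabric API. They assume the layout is a leading null character, then the object type and each attribute, each followed by a null character. The real implementation performs more validation than the simple checks shown here.

```python
NULL = "\x00"  # used both as the composite-key prefix and as the part terminator

def create_composite_key(object_type, attributes):
    """Join an object type and string attributes into one null-delimited key."""
    for part in [object_type, *attributes]:
        if not isinstance(part, str):
            raise TypeError("composite key parts must be strings")
        if NULL in part:
            raise ValueError("composite key parts must not contain a null character")
    return NULL + object_type + NULL + "".join(attr + NULL for attr in attributes)

def split_composite_key(key):
    """Recover (object_type, attributes) from a key built by create_composite_key."""
    parts = key[1:-1].split(NULL)  # drop the leading prefix and trailing terminator
    return parts[0], parts[1:]

def visualize(key):
    """Show the key with | in place of the null characters."""
    object_type, attributes = split_composite_key(key)
    return "|".join([object_type, *attributes])

profile_key = create_composite_key("PLAYER_PROFILE", ["Player1"])
print(visualize(profile_key))  # PLAYER_PROFILE|Player1

# Plain concatenation would collide ("AB" + "C" == "A" + "BC"); delimited keys do not.
print(create_composite_key("AB", ["C"]) == create_composite_key("A", ["BC"]))  # False
```

Non-string attributes such as a level or a season number would need to be converted with `str()` before being passed in.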

Key                      Value
PLAYER_PROFILE|Player1   {"Name": "Player1", "Level": 10, ...}
PLAYER_PROFILE|Player2   {"Name": "Player2", "Level": 27, ...}
PLAYER_VAULT|Player1     {"Name": "Player1", "Packs": [...] }
PLAYER_VAULT|Player2     {"Name": "Player2", "Packs": [...] }

Composite keys can be queried by all of their key or by a leading part of it. A query could be for PLAYER_VAULT|*, but you cannot skip a component in the query.

For example, if we added vaults per season of the game, we might store data with keys like:

PLAYER_VAULT|Player1|Season1
PLAYER_VAULT|Player1|Season2
PLAYER_VAULT|Player1|Season3

Then we could query with PLAYER_VAULT|Player1|* to return the vaults for a given player from all seasons. However, we cannot query with PLAYER_VAULT|*|Season1 to get all players who participated in Season 1.
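
The sketch below models that rule as a prefix match over null-delimited strings. It is a Python stand-in, not Fabric code: `composite_key`, `query_by_partial_key`, and the sample state (including Player10 and Player2) are hypothetical. The only way it can answer the "all players in Season 1" question is to fetch every vault and filter afterwards.

```python
SEP = "\x00"

def composite_key(object_type, attributes):
    """Null-delimited key: prefix, object type, then each attribute with a terminator."""
    return SEP + object_type + SEP + "".join(a + SEP for a in attributes)

def query_by_partial_key(state, object_type, leading_attributes):
    """Return keys whose leading components match exactly, in key order."""
    prefix = composite_key(object_type, leading_attributes)
    return sorted(k for k in state if k.startswith(prefix))

def readable(key):
    """Show a key with | in place of the null characters."""
    return key.strip(SEP).replace(SEP, "|")

state = {
    composite_key("PLAYER_VAULT", [player, season]): {"Packs": []}
    for player, season in [
        ("Player1", "Season1"), ("Player1", "Season2"), ("Player1", "Season3"),
        ("Player10", "Season1"), ("Player2", "Season1"),
    ]
}

# PLAYER_VAULT|Player1|* : the terminator after "Player1" keeps Player10 out.
player1_vaults = [readable(k) for k in query_by_partial_key(state, "PLAYER_VAULT", ["Player1"])]
print(player1_vaults)

# PLAYER_VAULT|*|Season1 is not expressible as a prefix, so scan PLAYER_VAULT|* and filter.
season1_vaults = [
    readable(k)
    for k in query_by_partial_key(state, "PLAYER_VAULT", [])
    if readable(k).split("|")[2] == "Season1"
]
print(season1_vaults)
```

If the Season 1 lookup were a common operation, the attribute order could be swapped (season first, then player) or a second set of keys could be maintained as a lookup.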

You may notice that because Fabric builds these composite keys into a single string with null characters, the querying capabilities are actually the same as with simple keys. As with all standards, it's there to make it easier to pick up code and go instead of re-learning or re-creating a new way to query keys for each project.

Defining Keys

The examples above show the difference between the v0.0 MVP data structure and the v0.1 Player Improvements data structure.

The way we key our data has a big impact on performance:

  • If we have few keys and large objects, then transactions are more likely to collide on the same key, which limits our overall throughput.

  • If we have a key for every single piece of data, then we have to query and fetch data many times to perform a single operation - kind of like the UTXO model.

This is the typical computing space-time tradeoff. There is no perfect solution for everything, and the ideal solution is somewhere between these two extremes.
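
To illustrate the first extreme, here is a toy model of two transactions landing in the same block. I am assuming that the key collisions described above come from a version check at commit time: a transaction is rejected if a key it read has changed since it read it. The `Ledger` class is a deliberately simplified Python stand-in, not how Fabric is implemented, and | is used as a visual separator in the keys.

```python
class Ledger:
    """Toy world state: each key has a value and a version number."""

    def __init__(self, initial):
        self.values = dict(initial)
        self.versions = {key: 1 for key in initial}

    def simulate(self, key, change):
        """Read a key, remember the version seen, and propose a new value."""
        proposed = {**self.values[key], **change}
        return {"key": key, "read_version": self.versions[key], "value": proposed}

    def commit(self, tx):
        """Apply the write only if the key is unchanged since it was read."""
        if self.versions[tx["key"]] != tx["read_version"]:
            return False  # another transaction updated this key first
        self.values[tx["key"]] = tx["value"]
        self.versions[tx["key"]] += 1
        return True

def run_in_same_block(ledger, requests):
    """Simulate every transaction first, then commit them in order."""
    proposals = [ledger.simulate(key, change) for key, change in requests]
    return [ledger.commit(tx) for tx in proposals]

# One large object per player: a level-up and a new pack touch the same key.
coarse = Ledger({"Player1": {"Level": 10, "Packs": ["001"]}})
coarse_results = run_in_same_block(coarse, [
    ("Player1", {"Level": 11}),
    ("Player1", {"Packs": ["001", "002"]}),
])

# Profile and vault split into separate keys: the same two changes no longer overlap.
fine = Ledger({
    "PLAYER_PROFILE|Player1": {"Level": 10},
    "PLAYER_VAULT|Player1": {"Packs": ["001"]},
})
fine_results = run_in_same_block(fine, [
    ("PLAYER_PROFILE|Player1", {"Level": 11}),
    ("PLAYER_VAULT|Player1", {"Packs": ["001", "002"]}),
])

print(coarse_results)  # [True, False]
print(fine_results)    # [True, True]
```

The cost of the split shows up on the read side: anything that needs both the level and the packs now has to fetch two keys instead of one.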

Thought Exercise 1

As we see in the v0.2 example, we are building a lookup of packs owned by a player in the Player Vault. An alternative to this, which is recommended in the Fabric documentation, would be to key the packs by the user, like:

PACK|Player1|001
PACK|Player1|002

With this approach we can query all packs for a player like: PACK|Player1|* without needing to know the packs a player has and without needing to index them. This helps prevent key collisions against the Player's Vault because we don't need to update the vault whenever a player gains or loses a pack. On the flip side, this makes it worse for all contract calls that want to check the packs a player has because they have to do a range query to find them all.

Key performance issues aside, what happens when Player1 gives this pack to Player2? Do we change the key? Since these packs will also be represented by tokens on external blockchains, it is not a good idea to let the key change over time.
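
A small sketch of that transfer problem, using a plain dict as the state and | as a visual separator. Both helper functions and the `Owner` field are hypothetical. With the owner in the key, a transfer is a delete plus a write under a new key; with a key based only on the pack ID, the key survives the transfer.

```python
def transfer_owner_keyed(state, pack_id, old_owner, new_owner):
    """Owner is part of the key, so the pack must move to a different key."""
    old_key = f"PACK|{old_owner}|{pack_id}"
    new_key = f"PACK|{new_owner}|{pack_id}"
    state[new_key] = state.pop(old_key)  # delete the old entry, write the new one
    return new_key

def transfer_stable_key(state, pack_id, new_owner):
    """Owner is stored in the value, so the key never changes."""
    key = f"PACK|{pack_id}"
    state[key] = {**state[key], "Owner": new_owner}
    return key

owner_keyed = {"PACK|Player1|001": {"Cards": 5}}
key_after_transfer = transfer_owner_keyed(owner_keyed, "001", "Player1", "Player2")
print(key_after_transfer)   # PACK|Player2|001
print(sorted(owner_keyed))  # ['PACK|Player2|001']

stable = {"PACK|001": {"Cards": 5, "Owner": "Player1"}}
stable_key = transfer_stable_key(stable, "001", "Player2")
print(stable_key)           # PACK|001
print(stable["PACK|001"])   # {'Cards': 5, 'Owner': 'Player2'}
```

Anything outside the ledger that stored PACK|Player1|001 as a reference would be left pointing at a key that no longer exists, which is the concern for tokens on external blockchains.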

Thought Exercise 2

Similar to the vault key concept above, perhaps our packs will also be released in seasons. We could key our packs this way:

PACK|Season1|0001
PACK|Season1|0002
...
PACK|Season2|0001
PACK|Season2|0002
...

This would allow us to fetch all packs with PACK|*, or packs for a specific season like PACK|Season1|*. This makes our querying capabilities more powerful and the keys more descriptive, but it duplicates more data.

Gotchas

Every key is unique, so writing to the same key twice will overwrite it with the last value. If you're not careful about this in your contract, you could create a vulnerability. In the past, the SDK we made had separate functions for Create and Update that would explicitly throw an error if we tried to Create something that already existed - the normal PutState does not do this.
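
One way such guards could look, sketched in Python against a plain dict. This is not the SDK mentioned above and not the Fabric API: `put_state`, `create`, `update`, and both exception classes are names invented for this example.

```python
class KeyExistsError(Exception):
    """Raised when create() targets a key that is already present."""

class KeyNotFoundError(Exception):
    """Raised when update() targets a key that does not exist."""

def put_state(state, key, value):
    """Unconditional write: silently overwrites whatever was there."""
    state[key] = value

def create(state, key, value):
    """Write only if the key is new."""
    if key in state:
        raise KeyExistsError(key)
    put_state(state, key, value)

def update(state, key, value):
    """Write only if the key already exists."""
    if key not in state:
        raise KeyNotFoundError(key)
    put_state(state, key, value)

state = {}
create(state, "PLAYER_PROFILE|Player1", {"Level": 10})

try:
    create(state, "PLAYER_PROFILE|Player1", {"Level": 1})
    duplicate_blocked = False
except KeyExistsError:
    duplicate_blocked = True

print(duplicate_blocked)                # True
print(state["PLAYER_PROFILE|Player1"])  # {'Level': 10}

# The raw write has no such guard and resets the player without complaint.
put_state(state, "PLAYER_PROFILE|Player1", {"Level": 1})
print(state["PLAYER_PROFILE|Player1"])  # {'Level': 1}
```

In a real contract the existence check would be a read of the key before the write, so it also adds that key to what the transaction has read.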

Conclusion

Even though Fabric doesn't have rigid network fees like Ethereum, we still must consider performance. We should focus on keying our data in a way that allows our contracts to perform well. We need to look at how the data will be used and retrieved by contracts. For commonly looked-up data, we should build a lookup to save on CPU cycles. And remember, we can always index the data off-chain to let us query for the IDs we want and then perform on-chain actions with those IDs.

The best part of using a flexible framework like Fabric is that we can easily, efficiently, and transparently adjust the key structure later if we run into serious performance issues. We can start with our best attempt and refine later on.
