This week, Sessions will be more fleshed out, and I wanted to take the time to explain how Mogu sessions will work, and what it means for the storage (and security) of user data.
This is going to be lengthy and at least a little convoluted, but probably at least a little informative if you’re interested in the Mogu project:
1) What Dynamic Nodes Do
Dynamic nodes are flagged in the “type” field of a node definition by adding a pipe character and the word ‘dynamic’, like so: “type” : “{text|dynamic}”
This flag lets Mogu know that you either plan on retrieving or storing information, and it looks for a storage policy to determine what to do with it. Static node content is all stored in an unencrypted namespace, as you well know. Dynamic node content can be stored in any number of places, can be encrypted or not, and may or may not be writable by the user. Any place where dynamic content may be stored is called a “Session”.
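To make the flag concrete, here is a minimal sketch in Python of how a type string like the one above could be split into a base type and its flags. This is not Mogu's actual parser; the function names `parse_node_type` and `is_dynamic` are made up for illustration.

```python
from typing import Set, Tuple


def parse_node_type(type_field: str) -> Tuple[str, Set[str]]:
    """Split a type string such as "{text|dynamic}" into (base type, flags)."""
    inner = type_field.strip().strip("{}")
    base, *flags = [part.strip() for part in inner.split("|")]
    return base, set(flags)


def is_dynamic(type_field: str) -> bool:
    """True when the pipe-separated flags include the word 'dynamic'."""
    return "dynamic" in parse_node_type(type_field)[1]
```

A node typed "{text}" would be treated as static, while "{text|dynamic}" would send Mogu looking for a storage policy.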
2) What a Session Is
Mogu sessions were inspired by Git, of all things. The theory behind them is that the user’s entire input history is stored. Much like everything else in Mogu, Sessions are all atomized, and Mogu manages what amounts to a linked list of user sessions. By default, all users operate within Mogu’s global session. When a user “logs in”, a new session is created for them, and the entire environment now operates under that session. Any writes to the database go to that session.
3) Reading Data From a Session
When Mogu is instructed to retrieve dynamic data, it first checks the current session to see if the data is stored there. If not, it checks the previous session in the linked list, and so on, and so forth, until it finds that data. By default, the first node of the linked list of sessions is the global session, so default entries can be stored there.
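The read path can be sketched in a few lines of Python. This uses plain in-memory objects rather than Redis, and the `Session` class and `read` function are hypothetical names for illustration, not Mogu's real API.

```python
from typing import Dict, Optional


class Session:
    """One link in the chain; `previous` points at the session before it."""

    def __init__(self, session_id: str, previous: Optional["Session"] = None):
        self.session_id = session_id
        self.previous = previous
        self.data: Dict[str, str] = {}


def read(session: Optional[Session], key: str) -> Optional[str]:
    """Walk back through the chain until some session holds `key`."""
    while session is not None:
        if key in session.data:
            return session.data[key]
        session = session.previous
    return None  # not even the global session has a default


# The global session sits at the start of the chain and holds defaults.
global_session = Session("global")
global_session.data["theme"] = "light"
first_login = Session("s1", previous=global_session)
first_login.data["nickname"] = "tom"
second_login = Session("s2", previous=first_login)
```

Reading "nickname" from `second_login` falls through to `first_login`, and reading "theme" falls all the way through to the global default.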
3b) That Seems Like a Terrible Way to Do Things! Won’t that Be Incredibly Slow?
a) Using Redis, it is generally a fast operation to look up a known piece of information, meaning that it’s very quick to determine the ID of the previous session, and then test to see whether a specific key exists. All of the required information will be at Mogu’s fingertips, so the lookups can happen quickly.
b) Note how earlier, I mentioned that all WRITES are stored in the current session. This means that when a piece of information is updated, its link in the chain moves up to the current session, however many notches that is, so more frequently written information will be quicker to access.
c) Mogu will include a “merge” functionality, which can be run manually or by way of a cron job, that will allow you to condense a user’s data into a single session and reset their linked list, storing only the most updated information.
d) Therefore, the maximum lookup time for a piece of information is X × Y, where X is the amount of time it takes to retrieve a session ID and send the “exists” command to Redis regarding the node in question, and Y is the number of sessions in the user’s history.
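Points (c) and (d) can be illustrated with a small Python sketch. It models a user's history as a list of dictionaries, newest session first, which is an assumption made for brevity and not how Mogu lays things out in Redis. The names `lookup` and `merge` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Chain = List[Dict[str, str]]  # newest session first


def lookup(chain: Chain, key: str) -> Tuple[Optional[str], int]:
    """Return (value, number of sessions checked) for `key`."""
    for checked, session in enumerate(chain, start=1):
        if key in session:
            return session[key], checked
    return None, len(chain)


def merge(chain: Chain) -> Chain:
    """Condense the chain into one session, keeping the newest value per key."""
    merged: Dict[str, str] = {}
    for session in reversed(chain):  # oldest first, so newer writes win
        merged.update(session)
    return [merged]
```

With three sessions in the history, a key written only in the oldest one costs three checks. After a merge, every key costs one.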
4) Writing Data to a Session
As has already been mentioned, all new data is written to the current session. That is the only place data can possibly be written. This means that unless you do a session merge, no information is actually “replaced”. If there is ever a need to look up previous information, you’ll be able to do that.
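Here is a rough Python sketch of that append-only behaviour, again with the history modelled as a list of dictionaries, newest first. That layout and the names `write` and `history` are assumptions for illustration only.

```python
from typing import Dict, List

Chain = List[Dict[str, str]]  # newest session first; chain[0] is the current one


def write(chain: Chain, key: str, value: str) -> None:
    """Writes only ever touch the current session."""
    chain[0][key] = value


def history(chain: Chain, key: str) -> List[str]:
    """Every value still held for `key`, newest first."""
    return [session[key] for session in chain if key in session]
```

Writing a new value in the current session leaves the value from an earlier session untouched, so both show up in `history` until a merge is run.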
5) Security in the Form of Redundancies
One repercussion of this method of user management is that if something happens to a user’s current session, you’ll be able to roll them back to their previous session with minimal loss of data. All that will be lost is whatever writes have occurred to the database since they logged in.
This also means that with a good backup regimen, you’ll be able to maintain users’ complete history. For instance, by doing a weekly hard backup, followed by a session merge, you’ll have all of your users’ complete data entry history stored on your disk, but only have a week’s worth of key:value pairs stored in RAM on Redis.
6) Security in the Form of Constantly Moving Information
Your users’ names and passwords aren’t just chilling in a single table on your database. In fact, they aren’t even stored in the same place. There is a node that equates a hashed, encrypted username with its most recent session. There is a node that equates a hashed, encrypted password with its most recent auth token. There is a single field in each session that stores its auth token.
Every time a user logs in, the username/most recent session table updates. A new auth token is generated and associated with the most recent session, and the user’s password is associated with that auth token. Previous auth tokens are deleted entirely.
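The rotation described above might look roughly like this in Python. Everything here is a simplified stand-in: the `AuthStore` class and its three dictionaries are hypothetical, password verification is left out entirely, and the unsalted SHA-256 `digest` helper is only a placeholder for whatever hashing and encryption Mogu really applies (it would not be good enough for real passwords).

```python
import hashlib
import secrets


def digest(value: str) -> str:
    # Placeholder only: unsalted SHA-256 is not suitable for real passwords.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class AuthStore:
    def __init__(self):
        self.user_to_session = {}    # hashed username -> most recent session ID
        self.password_to_token = {}  # hashed password -> most recent auth token
        self.session_token = {}      # session ID -> that session's auth token

    def login(self, username: str, password: str):
        """Create a new session and token, dropping the previous token."""
        user_key, pass_key = digest(username), digest(password)
        old_session = self.user_to_session.get(user_key)
        if old_session is not None:
            self.session_token.pop(old_session, None)  # old token is deleted
        session_id = secrets.token_hex(8)
        token = secrets.token_hex(16)
        self.user_to_session[user_key] = session_id
        self.password_to_token[pass_key] = token
        self.session_token[session_id] = token
        return session_id, token
```

Two logins in a row by the same user produce two different sessions and tokens, and only the newer token survives.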
Therefore, user information isn’t just sitting someplace collecting dust, waiting for hackers to break it. It changes every time a user logs in.
7) Security in the Form of Obfuscation
And of course, with the exception of the node templates you edit and import, no dynamic content is stored without being obfuscated. Even if it’s not encrypted, there is not a table that says “global.username_lookup”. You can of course edit the source code of your Mogu build to further increase security by salting these obfuscations, so that if one Mogu site ever gets hacked, yours can’t be hacked the same way.
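As a rough illustration of salted obfuscation, here is a Python sketch that derives a stored key name from a readable name and a per-site salt using HMAC-SHA256. This is one way to do it, not necessarily what Mogu does; the function name `obfuscate_key` and the salt values are made up.

```python
import hashlib
import hmac


def obfuscate_key(name: str, site_salt: bytes) -> str:
    """Derive an opaque, repeatable key name from a readable one."""
    return hmac.new(site_salt, name.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same name and salt always give the same stored key, so lookups still work, while two sites with different salts end up with different key names for the same logical table.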
8) Analytics Just got Even Better
Lastly, a great thing about this method of session management is that you could assemble a utility to piece together all sorts of crazy information throughout your application’s entire existence, broken down by user, by event, by node, or…well, just about any way you want to split it up, and know that with a properly implemented backup and merge schedule, it won’t inhibit the overall performance of your user experience.
With Mogu, everything is trackable, everything is atomized, and everything is saved. Everything is obfuscated, everything can be encrypted (EASILY), integrity is constantly tested throughout the user’s interactions, and, of course, it still goes real fast.
Friday’s build will bring this functionality to light.
-tom
