The usual way to store data in MATLAB is with the save command. But if you have data with many different parameters,
querying .mat files becomes cumbersome. What actually works, even with complex queries, is a database and a query language like
SQL (structured query language).

SQL is a declarative language, meaning you specify what data you want, not how you want to get it. That means SQL takes
care of all look-up algorithms. We first create a database from the command line:

Suppose we want to store a tensor at a certain temperature, bond dimension chi and
convergence. We create a table as follows:

The tensor field is a BLOB, meaning it’s raw bytes – we’ll have to write a function to serialize tensors from MATLAB in
this way. The other fields are NUMERIC, which does not require conversion from MATLAB numbers. The next step is adding data and querying the database, but this has to be
done from MATLAB. (In principle, the CREATE TABLE statement can be done from MATLAB too, but it is good to be familiar
with the sqlite interpreter.) Use .exit to exit the interpreter.

I use the package matlab-sqlite3-driver for interfacing with SQL from MATLAB. Suppose we want to store this data:

First, we have to serialize tensor. That can be done as follows:

Now, we store the data by accessing the database and constructing a SQL query:

At a later time, we can query the database to obtain a struct with our data.

With

After the initial learning curve, it’s very easy to construct queries like:

which selects the tensor with matching temperature and chi, but with the smallest available convergence bigger or equal to the one specified.

A final important issue is query optimalization. The above query does a full table scan to get its result, which quickly gets painful. This \( \mathcal{O}(N) \) look-up time can be reduced to \( \mathcal{O}(k \log N) \), where \( N \) is the number of rows and \( k \) the number of matching convergences, by adding an index:

On my machine, this reduced the running time of 500 queries from a table with \( 10^5 \) rows of random data from 12 seconds to 0.2 seconds.