To prettify symbols in c-mode using
prettify-symbols-alist, you can add a hook in your Emacs
configuration file (e.g., init.el or .emacs).
Here's an example:
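A minimal sketch, assuming the usual hook mechanism (the symbol choices below are illustrative; pick whichever glyphs you prefer):

```emacs-lisp
(defun my-c-mode-prettify-hook ()
  "Enable prettify-symbols-mode with C-specific replacements."
  (setq prettify-symbols-alist
        '(("->" . ?→)
          ("!=" . ?≠)
          ("<=" . ?≤)
          (">=" . ?≥)
          ("&&" . ?∧)
          ("||" . ?∨)))
  (prettify-symbols-mode 1))

(add-hook 'c-mode-hook #'my-c-mode-prettify-hook)
```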
This hook will enable prettify-symbols-mode in
c-mode and replace common C operators with prettier
symbols. Adjust the prettify-symbols-alist as needed for
your preferences.
The LLM is too big to fit on the device. Is there a better
architecture that makes the model small enough to run on the device
while keeping its accuracy and inference performance?
An LLM embeds each word into a very high-dimensional space in order to
classify better, but this suffers from the curse of dimensionality: we
need a deeper neural network (more parameters) and more training data
to avoid overfitting.
Moreover, each time some words are added to the space, a new dimension
may be introduced that is useful only for related words; for most
other words that dimension is useless. The whole vector space ends up
sparse and needs more training data and a more complex model (more
parameters).
We can borrow the idea of mixture of experts. The experts form a
graph: some experts are independent, while others are related and can
be connected by edges. The problem then becomes making each expert's
dimensionality as small as possible, and the features duplicated
across different experts as few as possible. We convert one
high-dimensional sparse LLM into many low-dimensional dense experts.
This is just like symbol table design in a compiler: we could use one
big flattened symbol table, but the scope level and scope name are the
same for every entry within a scope and can be eliminated, so the
traditional implementation chains per-scope hash tables, giving a more
compact memory layout.
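The chained-scope symbol table analogy can be sketched as a toy model (class and field names here are illustrative, not a real compiler's API):

```python
class Scope:
    """One scope level: a small hash table chained to its parent scope."""
    def __init__(self, parent=None):
        self.symbols = {}      # names defined directly in this scope
        self.parent = parent   # enclosing scope, or None for globals

    def define(self, name, info):
        self.symbols[name] = info

    def lookup(self, name):
        # Walk the chain from innermost to outermost scope.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None

# globals -> function scope -> block scope
globals_ = Scope()
globals_.define("printf", "extern function")
func = Scope(parent=globals_)
func.define("argc", "int parameter")
block = Scope(parent=func)
block.define("i", "int local")

assert block.lookup("i") == "int local"             # found locally
assert block.lookup("printf") == "extern function"  # found via the chain
```

Each scope stores only its own names, so the scope level and name are implicit in the chain rather than repeated in every entry, which is the compactness the analogy points at.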
Design
A graph G = (V, E) represents the model, where V is the set of
experts and E is the set of edges between related experts.
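This graph can be sketched with a minimal data structure (the names and API below are illustrative):

```python
class ExpertGraph:
    """Vertices are experts; undirected edges connect related experts."""
    def __init__(self):
        self.experts = set()   # V: expert names
        self.edges = set()     # E: unordered pairs of related experts

    def add_expert(self, name):
        self.experts.add(name)

    def connect(self, a, b):
        # An edge means the two experts share related features.
        self.edges.add(frozenset((a, b)))

    def neighbors(self, name):
        return {e for pair in self.edges if name in pair
                for e in pair if e != name}

g = ExpertGraph()
for e in ("operating system", "database", "common knowledge"):
    g.add_expert(e)
g.connect("operating system", "database")
assert g.neighbors("database") == {"operating system"}
```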
Partitioning
Take the computing system as the whole world. We can split the experts
into:
operating system
database
compiler
programming language
distributed systems
network
micro architecture
application
electronics
common knowledge
When the word file system arrives, it is added to both
operating system and database; when the word
computer arrives, it is put into
common knowledge. All experts depend on the
common knowledge.
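This routing rule can be sketched as follows (the vocabularies are made-up placeholders for whatever classifier actually assigns words to experts):

```python
# Illustrative per-expert vocabularies: a word goes to every expert
# that claims it; unclaimed words fall back to "common knowledge".
EXPERT_VOCAB = {
    "operating system": {"file system", "process", "scheduler"},
    "database": {"file system", "transaction", "index"},
    "compiler": {"symbol table", "parser"},
}

def route(word):
    targets = [name for name, vocab in EXPERT_VOCAB.items()
               if word in vocab]
    return targets or ["common knowledge"]

assert set(route("file system")) == {"operating system", "database"}
assert route("computer") == ["common knowledge"]
```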
Evolve
All the experts can be version-controlled and can evolve through
refactoring: adding more words, adding dimensions, moving words to
other experts, or extracting new experts. Just like a
package-management system, all experts are packages.
Query (inference)
The query is converted to tokens and sent to the related experts. The
exploit method uses the dependent experts to combine the
results; the explore method also queries other experts to find
results. The result involves several experts, and this sub-graph forms
a path. The path can be cached.
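A sketch of exploit/explore inference with path caching, under the same made-up vocabularies as above (expert names and ownership are illustrative):

```python
# Which tokens each expert owns, and which experts it depends on.
EXPERTS = {
    "operating system": {"file system", "process"},
    "database": {"transaction", "file system"},
    "common knowledge": {"computer"},
}
DEPENDS_ON = {  # every specialist depends on common knowledge
    "operating system": ["common knowledge"],
    "database": ["common knowledge"],
}
path_cache = {}

def query(tokens):
    key = tuple(tokens)
    if key in path_cache:
        return path_cache[key]          # cached path: skip routing
    path = []
    for tok in tokens:
        # Exploit: send the token to the experts that own it.
        owners = [e for e, vocab in EXPERTS.items() if tok in vocab]
        # Explore: also consult each owner's dependency experts.
        for e in owners:
            for dep in DEPENDS_ON.get(e, []):
                if dep not in path:
                    path.append(dep)
            if e not in path:
                path.append(e)
    path_cache[key] = path              # the sub-graph forms a cached path
    return path

p = query(["file system", "computer"])
assert "operating system" in p and "database" in p
assert query(["file system", "computer"]) is p   # served from the cache
```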
Online learning
Support real-time input sentences. The tokenizer first classifies each
sentence and routes it to the appropriate expert; a full batch
triggers an insert. The insert adds the data to the experts and
updates the cached paths.
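This ingestion flow can be sketched as follows (the keyword classifier stands in for a real tokenizer/classifier, and the batch size is arbitrary):

```python
BATCH_SIZE = 4
buffers = {}       # expert name -> sentences pending insertion
expert_data = {}   # expert name -> inserted sentences
path_cache = {}    # cached query paths, invalidated on insert

def classify(sentence):
    # Hypothetical classifier: route by a simple keyword match.
    return "database" if "transaction" in sentence else "common knowledge"

def ingest(sentence):
    expert = classify(sentence)
    buffers.setdefault(expert, []).append(sentence)
    if len(buffers[expert]) >= BATCH_SIZE:   # a full batch triggers insert
        expert_data.setdefault(expert, []).extend(buffers[expert])
        buffers[expert].clear()
        path_cache.clear()                   # update (invalidate) cached paths

for i in range(4):
    ingest(f"transaction {i} committed")
assert len(expert_data["database"]) == 4    # batch flushed on the 4th insert
assert buffers["database"] == []
```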