public:: true
- LlamaCpp is an open-source project whose goal is to make LLMs run on consumer machines
- One way to do this is by quantization of the models →
	- The idea is to replace the data type used in the model with another type that takes less space.
	- Normal models use 32-bit floats, and we can quantize them to 16, 8, or even 4 bits
	- Of course there is some loss of precision, but in exchange we can fit the models into RAM
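- The idea above can be sketched in plain NumPy with a simple symmetric int8 quantization (illustrative only — llama.cpp uses its own block-wise GGUF formats such as Q8_0 and Q4_K, and every name in this snippet is hypothetical):
  ```python
  import numpy as np

  # Pretend these are float32 model weights (4 bytes per value).
  weights = np.random.randn(1024).astype(np.float32)

  # Symmetric quantization: map the largest magnitude to 127.
  scale = np.abs(weights).max() / 127
  q = np.round(weights / scale).astype(np.int8)  # 1 byte per value instead of 4

  # Approximate reconstruction ("dequantization") at inference time.
  dequant = q.astype(np.float32) * scale

  print(weights.nbytes)  # 4096 bytes in float32
  print(q.nbytes)        # 1024 bytes in int8, a 4x memory saving
  print(float(np.abs(weights - dequant).max()))  # precision loss, bounded by scale / 2
  ```
- The same trick at 4 bits halves the storage again, at the cost of a larger rounding error per weight — which is the trade-off the bullets above describe.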