Author: cek
-
Orpheus & The Compatibility Curse
(aka Why I Yell at Clouds…and TTS Libraries)
So, picture this: Barcelona. Almost midnight. Me, fueled by the sheer force of will required to juggle a million things at once. And what am I doing? Not dreaming up fantastical projects or perfecting intricate algorithms…no, no. I’m wrestling with a TTS library. A Text-To-Speech library. Because…
-
From LLM Tuning to Meditations of Marcus Aurelius…
Tonight I was deep in the middle of adding new features to my LLM code, and during one of my performance tests the conversation drifted into something so profound for me that I felt the need to share it from my tiny and insignificant corner… Yes, there are several famous excerpts from Marcus…
-
Lion (EvoLved Sign Momentum): The New Optimizer Discovered by Google Brain
According to the authors of the paper, a suitable learning rate for Lion is typically 3-10 times lower than that used with AdamW. Since the effective weight decay is lr * λ, the decoupled weight decay λ used for Lion should be 3-10 times larger than with AdamW to maintain a similar strength.
The…
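That scaling rule is easy to sanity-check in code. Here is a minimal sketch, assuming the community lion-pytorch package; the toy model and the exact hyperparameter values are placeholders for illustration, not the paper's settings:

```python
import torch
from torch import nn
from lion_pytorch import Lion  # community implementation: pip install lion-pytorch

model = nn.Linear(128, 10)  # toy model, just for illustration

# Baseline AdamW configuration (placeholder values).
adamw = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)

# Lion: lr roughly 3-10x lower, decoupled weight decay λ roughly 3-10x larger,
# so the effective weight decay lr * λ stays about the same:
#   AdamW: 3e-4 * 0.1 = 3e-5
#   Lion:  1e-4 * 0.3 = 3e-5
lion = Lion(model.parameters(), lr=1e-4, weight_decay=0.3)
```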
-
The Future of Smartphones: Betting Big on sLM
Mobile device manufacturers are optimistic about the prospects for artificial intelligence (AI) in smartphones. Companies like Qualcomm and MediaTek have launched smartphone chipsets with enough muscle to process AI applications on-device. Previously, many on-device AI applications were partially processed in the cloud, with the results then sent back to the phone. However, cloud-based models…
-
Apple Open Sources Large Models for Mobile Devices!
Apple has released an artificial intelligence (AI) model called OpenELM (Open Efficient Language Model), along with its code, weights, datasets, and training process. Like Google, Samsung, and Microsoft, which are focusing on developing generative AI models for both desktop and mobile devices, Apple has now joined this trend. This marks the birth of a new family…
-
OpenBioLLM-70B and 8B: Outperforms GPT-4, Gemini, Meditron-70B, Med-PaLM-1 and Med-PaLM-2 in the medical domain
The developers of this model created a custom, diverse dataset in collaboration with medical experts to ensure the highest quality. The dataset covers over 3,000 healthcare topics and more than 10 medical subjects. The outstanding performance of OpenBioLLM-70B is evident across 9 diverse biomedical datasets, where it achieves an impressive average score of 86.06% despite having fewer parameters than GPT-4…
-
LLM Tuning Fun…
Also available on my LinkedIn.
Today I was in the middle of a tuning session, timing how long my adapted LLM takes to respond using a metric I call “TTFW” (Time To First Word), as I’m constantly working on improving all the different bits and pieces so AiMA Beyond Ai gives…
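For the curious, a TTFW-style measurement can be sketched in a few lines. This is a minimal illustration with a hypothetical streaming client and a fake stream for testing; it is not AiMA’s actual code:

```python
import time
from typing import Iterable

def measure_ttfw(token_stream: Iterable[str]) -> float:
    """Return seconds from request start until the first whole word arrives.

    `token_stream` is any iterable yielding text chunks as the model streams
    its reply (a hypothetical stand-in for the real streaming client).
    """
    start = time.perf_counter()
    buffer = ""
    for chunk in token_stream:
        buffer += chunk
        # A whitespace boundary after non-empty text marks the first word.
        if buffer.strip() and (" " in buffer or "\n" in buffer):
            return time.perf_counter() - start
    # Stream ended before a boundary; treat the whole reply as the first word.
    return time.perf_counter() - start

# Usage with a fake stream that simulates model latency:
def fake_stream():
    time.sleep(0.35)                      # model "thinking" before the first chunk
    for chunk in ["Hel", "lo ", "there!"]:
        time.sleep(0.02)
        yield chunk

print(f"TTFW: {measure_ttfw(fake_stream()):.3f}s")
```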