@FauzulHakeem wow, how does it run on 12GB? I thought 35B needed at least 24GB. Could you share which flags you're using? For reference, these are the extra flags I use:
-ngl 99 -c 131072 -np 1 -fa on --cache-type-k q4_0 --cache-type-v q4_0 --host 0.0.0.0 --port 8080
I can chat with Qwen3.5 directly on the host, but the connection always fails when I enter it in the Hermes model setup.
I started my local LLM journey just 3 days ago by building llama.cpp with CUDA from source, which took me 2 days, but I persisted lol.
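For the connection failure described above, a quick reachability check from the machine (or container) running the client may help narrow things down. This is only a sketch: it assumes the server was started with `--host 0.0.0.0 --port 8080` as in the flags above and that it exposes the OpenAI-style `/v1/models` route. The helper names `api_base` and `list_models` are made up for illustration, and whether the Hermes setup expects the base URL with or without `/v1` is an assumption to verify.

```python
import json
import urllib.error
import urllib.request


def api_base(host: str, port: int) -> str:
    """Build an OpenAI-style base URL for a server started with --host/--port."""
    return f"http://{host}:{port}/v1"


def list_models(base: str, timeout: float = 5.0):
    """Return model ids from GET {base}/models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base}/models", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    return [m.get("id") for m in payload.get("data", [])]


if __name__ == "__main__":
    # Assumption: replace 127.0.0.1 with the host's LAN IP when the client
    # runs on another machine or inside a container.
    base = api_base("127.0.0.1", 8080)
    print(base, "->", list_models(base))
```

Note that `0.0.0.0` is a bind address, not something to type into a client. If this prints `None` from where the client runs but works on the host itself, the likely suspects are the address being used (`127.0.0.1` inside a container points at the container) or a firewall, rather than the model.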
The Dana octopus squid (Taningia danae)
On two of its arms are giant photophores, the largest light-producing organs in any known animal
https://t.co/eCCRlVG5oA