Rabu, 10 September 2025

Llamacpp Server Multichat Flask Webui V3


🧠 Offline AI Chat & Coding GUI (Rakyat Edition)

🏆 "LLaMA.CPP Rakyat Edition: ChatGPT Experience in Kilobytes"
🏆 "Your Own GPT Pro: Local, Open-Source & Fully Customizable"
💻 “AI Chatbot Offline untuk Semua: LlamaCPP GUI Rakyat Edition”
💻 “LlamaGPT WebUI Rakyat Deluxe”

🇮🇩 This project is primarily documented in Indonesian. 🇬🇧 An English overview is provided below. This project is based on the original LLaMA GUI by Satria Novian.

📌 The link to buy the Llamacpp AI Chatbot WebUI is below (Link untuk membeli Llamacpp AI Chatbot WebUI ada di bawah ini)
💸 Harga cuma Rp80.000 / Price is only $8

Lynk.id: http://lynk.id/satrianovian20/j42nom6r3g3y

Gumroad: https://satrianovian.gumroad.com/l/velewz

🇮🇩 Bahasa Indonesia:
Headline / Judul:

✨ “Multichat Offline Paling Stabil di Dunia GUI — Tidak Bisa Ditiru Sembarangan”

Subheadline:

🏆Jalankan banyak tab dengan satu model sekaligus, 100% offline, stabil di PC mid-range. Advanced user GGUF pasti sulit menandingi ini.

Hook / Pembuka:

📌Bayangkan punya GUI AI offline yang memungkinkan kamu membuka banyak tab sekaligus tanpa crash. Semua sesi berjalan lancar di PC biasa — tanpa server mahal, tanpa cloud.

📌Ini bukan sekadar janji, ini rahasia source code di tangan saya sendiri, eksklusif, Coding Rakyat Edition.

Keunggulan Utama:

💻Multitab Stabil Tiada Tanding: Buka beberapa sesi AI sekaligus tanpa takut crash.

⚙️Offline 100%: Privasi total dan kontrol penuh.

📣 Tagline Ringkas: “Bukan Multisession, Ini Multitab Asli. Stabil, Offline, dan Eksklusif.”

📦Satu Model Tapi Efisien: Fokus stabilitas dulu, tapi tetap bisa handle banyak tab.

🔒Eksklusif & Tidak Bisa Direplikasi Mudah: Advanced user GGUF / offline model pun akan kesulitan meniru setup ini.

🎨UI Mirip ChatGPT, Mudah Dipakai: Langsung produktif, tanpa learning curve.

🚀 Multitab Asli ≠ Multisession 🚀

Selama ini kamu pikir multitab = multisession? Salah besar.
Punya gue 1 otak AI jalan bareng di banyak tab, tanpa tabrakan, tanpa timeout.

  • Buka 3–5 tab? Tetep sinkron.

  • Model cuma 14B Q6_K_XL, RAM 16GB? Masih ngacir.

  • Queue request otomatis → siapa input duluan, itu yang jalan.

Bukan “tab bohongan” kayak di webUI lain. Ini multitab asli non-multisession.
Sekali nyobain, balik ke multisession bakal berasa jadul.

😏 Ada yang berani tes bedanya?


🇬🇧 English Version:
Headline:

✨ “The Most Stable Offline Multichat in the GUI World — Not Easily Imitated”

Subheadline:

🏆 Run multiple tabs on a single model simultaneously, 100% offline, stable on a mid-range PC. Even advanced GGUF users will find this hard to beat.

Hook:

📌Imagine having an offline AI GUI that lets you open multiple tabs simultaneously without crashing. All sessions run smoothly on a regular PC — no expensive servers, no cloud.

📌This isn't just a promise; it's my own secret source code, exclusive, Coding Rakyat Edition.

Key Advantages:

💻Unparalleled Stability: Open multiple AI sessions simultaneously without fear of crashes.

⚙️100% Offline: Total privacy and full control.

📣 Short Tagline: “Not Multisession, This is Real Multitab. Stable, Offline, and Exclusive.”

📦One Model But Efficient: Focus on stability first, but still able to handle multiple tabs.

🔒Exclusive & Not Easily Replicable: Even advanced GGUF users / offline models will have a hard time replicating this setup.

🎨ChatGPT-like UI, Easy to Use: Instantly productive, no learning curve.

🚀 Real Multitab ≠ Multisession 🚀

You thought multitab = multisession? You're dead wrong.
I have one AI brain running simultaneously on multiple tabs, without collisions or timeouts.

  • Open 3–5 tabs? They stay in sync.

  • A 14B Q6_K_XL model on 16GB RAM? Still runs smoothly.

  • Automatic request queue → whoever inputs first, runs first.

Not "fake tabs" like in other webUIs. This is real, non-multisession multitab.
Once you try it, going back to multisession will feel old-fashioned.

😏 Anyone dare to test the difference?
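The "automatic request queue" behaviour described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the product's source code: several tabs share one model worker through a FIFO queue, and the echo reply stands in for a real llama.cpp call.

```python
import queue
import threading

# One shared worker serves every tab; requests are handled strictly in
# arrival order, so tabs never collide over the single loaded model.
# The reply below is a stub -- a real GUI would call llama.cpp here.

request_q = queue.Queue()   # FIFO: whoever submits first runs first
responses = {}              # tab_id -> list of replies

def model_worker():
    while True:
        item = request_q.get()
        if item is None:            # shutdown sentinel
            request_q.task_done()
            break
        tab_id, prompt = item
        reply = f"echo({prompt})"   # stand-in for the single model
        responses.setdefault(tab_id, []).append(reply)
        request_q.task_done()

worker = threading.Thread(target=model_worker, daemon=True)
worker.start()

# Three "tabs" submit at once; the worker serializes them.
for tab_id, prompt in [("tab1", "hello"), ("tab2", "hi"), ("tab3", "hey")]:
    request_q.put((tab_id, prompt))

request_q.put(None)   # tell the worker to stop
request_q.join()      # wait until every request has been answered
```

The key design point is that a single `queue.Queue` gives first-in, first-out ordering for free, so no tab can interrupt another tab's generation.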


📌 Dependency links to run the Llamacpp AI Chatbot GUI! (Link dependensi untuk menjalankan Llamacpp AI Chatbot GUI!)

Internal File / File Internal: https://drive.google.com/file/d/1PC709_d4FWu3v3qZgUpA6sfC5K3pV4Pq/view?usp=drive_link

Llamacpp Build Releases: https://github.com/ggml-org/llama.cpp/releases
GGUF Models: via Hugging Face

📌 Panduan Instalasi / Installation Guide:
- Letakkan file .exe dan file internal di dalam folder hasil ekstrak Llamacpp Build Releases (Place the .exe and the internal files inside the extracted Llamacpp Build Releases folder)
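For illustration, assuming the standard llama.cpp Windows release layout, the result might look like this. Every folder and file name below is a placeholder, not the product's actual name, and the port is an assumption:

```shell
# Hypothetical layout after extracting a llama.cpp release and copying
# in the GUI files (build folder and model file names are placeholders):
#
#   llama-bXXXX-bin-win-x64/
#     llama-server.exe          # from the llama.cpp release
#     chatbot-gui.exe           # the purchased GUI .exe
#     internal/                 # the "Internal File" download
#     models/
#       model.Q4_K_M.gguf       # any GGUF model from Hugging Face
#
# Start the backend first, then launch the GUI:
llama-server.exe -m models/model.Q4_K_M.gguf --port 8080
```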

📸 Screenshot:

🎥 Video:




🇮🇩 Bahasa Indonesia:

✅ FAQ — Pertanyaan Umum (Trust Booster Edition) 

❓ GUI ini beneran bisa jalanin model 13B tanpa GPU? Ya! Sudah diuji langsung dengan model llama-2-13b-chat.Q4_K_M.gguf di sistem dengan:

💻 CPU: Intel i5-9400F (tanpa iGPU)

🧠 RAM: 16GB DDR4

⚙️ Backend: llama.cpp

📦 GUI: Llamacpp AI Chatbot GUI

❓ Bukti nyatanya mana? 📸 Screenshot saat load model dan idle sudah diunggah di folder docs/screenshots/

📄 Log lengkap sesi percobaan model 13B tersedia di docs/session-logs/

Tidak ada error, tidak crash, hanya delay wajar saat proses berat.

❓ GUI-nya berat gak? Tidak. GUI ini hanya 10KB, tanpa dependensi besar seperti Gradio atau Electron.

Tidak buka port aneh-aneh.

Tidak ada tracking.

Murni offline dan lokal.

UI sangat ringan, hanya berbasis tkinter.

❓ Bisa pakai model 7B, 8B, atau 13B lain? Bisa! Sudah diuji dengan:

Mistral 7B

DeepSeek Coder 6.7B

DeepSeek Coder 7B

Nous Hermes 13B (Q4_K_M)

LLaMA 13B (Q4_K_M)

❓ RAM saya cuma 8GB, bisa jalan? Bisa, asal model yang dipilih sesuai. Gunakan model kecil seperti:

TinyLlama 1.1B Q8_0

DeepSeek Coder 1.3B Q8_0

Mistral 4B Q8_0

Open Hermes 7B Q4_K_M

Atur max_tokens di GUI agar tidak melebihi kapasitas RAM kamu.

❓“Saya masih nggak percaya GUI ini bisa jalanin model 13B cuma dengan RAM 16GB. Beneran bisa?” 💬 “Coba sendiri aja bro 😎”

❓“Emang GUI-nya ringan banget ya?” ✅ Iya. Ukuran file .py cuma sekitar 10KB. Gak ada embel-embel web server, backend rumit, atau library berat.

❓“Bisa crash gak pas load model besar?” 🚫 Selama sistem kamu stabil dan swap file aktif, hampir nggak pernah crash. Bahkan log menunjukkan performa tetap normal walau pemakaian RAM di atas 15GB pas awal load.

❓“Ada buktinya?” 📸 Sudah ada screenshot dan log di folder docs/session-logs/ dan docs/screenshots/.

❓“Kalau saya nggak percaya tetap?” 😎 Silakan buktikan sendiri.


🇬🇧 English Version:

✅ FAQ — Frequently Asked Questions (Trust Booster Edition) 

❓ Can this GUI really run a 13B model without a GPU? ✅ Yes! Successfully tested with llama-2-13b-chat.Q4_K_M.gguf on:

💻 CPU: Intel i5-9400F (no iGPU)

🧠 RAM: 16GB DDR4

⚙️ Backend: llama.cpp

📦 GUI: Llamacpp AI Chatbot GUI

❓ Where’s the real proof? 📸 Screenshots during model load and idle are uploaded to docs/screenshots/ 📄 Complete 13B model session logs available in docs/session-logs/ ✅ No errors, no crashes. Just slight delay under heavy processing — perfectly normal.

❓ Is this GUI heavy? ❌ Not at all. It’s just 10KB. No bloated dependencies like Gradio or Electron. ✔️ No random ports. No tracking. ✔️ 100% offline and local. ✔️ Based purely on Tkinter.

❓ Can I use other 7B, 8B, or 13B models? ✅ Absolutely! Already tested with:

Mistral 7B

DeepSeek Coder 6.7B

DeepSeek Coder 7B

Nous Hermes 13B (Q4_K_M)

LLaMA 13B (Q4_K_M)

❓ I only have 8GB RAM, will it work? ✅ Yes, just use smaller models like:

TinyLlama 1.1B Q8_0

DeepSeek Coder 1.3b Q8_0

Mistral 4B Q8_0

Open Hermes 7B Q4_K_M

🛠️ Set max_tokens low to match your available RAM in the GUI settings.
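As a rough illustration of matching max_tokens to available RAM, the helper below picks a conservative cap from the estimated memory headroom. The thresholds and the 0.6 GB-per-billion-parameters footprint are assumptions for illustration, not measurements from this GUI; tune them for your own model, quantization, and swap setup.

```python
# Illustrative only: a conservative rule of thumb for capping max_tokens
# on low-RAM, CPU-only llama.cpp setups. All numbers are assumptions.

def choose_max_tokens(ram_gb: float, model_params_b: float) -> int:
    """Pick a conservative generation cap for a CPU-only setup."""
    # Rough Q4-class footprint guess: ~0.6 GB per billion parameters.
    headroom = ram_gb - model_params_b * 0.6
    if headroom <= 1:
        return 256      # barely fits: keep replies short
    if headroom <= 4:
        return 512      # tight but workable
    return 1024         # comfortable headroom

print(choose_max_tokens(8, 1.1))   # TinyLlama-class model on 8GB → 1024
print(choose_max_tokens(16, 13))   # 13B model on 16GB → 1024
```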

❓ “I still don’t believe this GUI can run 13B on just 16GB RAM. Really?” 💬 “Try it yourself, bro. 😎”

❓ “Is the GUI really that lightweight?” ✅ Yep. The .py file is only about 10KB. No web servers, no complex backends, no heavy libraries.

❓ “Will it crash when loading large models?” 🚫 As long as your system is stable and a swap file is active, crashes are extremely rare. 📊 Logs show stable performance even with RAM usage above 15GB during model load.

❓ “Is there actual proof?” 📸 Yes. Screenshots and logs are available in the docs/session-logs/ and docs/screenshots/ folders.

❓ “What if I still don’t believe?” 😎 Feel free to test it yourself. 


Sabtu, 30 Agustus 2025

Llamacpp Multimodal AI GUI

 


🧠 Offline AI Chat & Coding GUI (Rakyat Edition)

🏆 "LLaMA.CPP Rakyat Edition: ChatGPT Experience in Kilobytes"
🏆 "Your Own GPT Pro: Local, Open-Source & Fully Customizable"
💻 “AI Chatbot Offline untuk Semua: LlamaCPP GUI Rakyat Edition”
💻 “LlamaGPT WebUI Rakyat Deluxe”

🇮🇩 This project is primarily documented in Indonesian. 🇬🇧 An English overview is provided below. This project is based on the original LLaMA GUI by Satria Novian.

📌 The link to buy the Llamacpp AI Chatbot WebUI is below (Link untuk membeli Llamacpp AI Chatbot WebUI ada di bawah ini)
💸 Harga cuma Rp85.000 / Price is only $9

Lynk.id: http://lynk.id/satrianovian20/l6e522l826kv

Gumroad: https://satrianovian.gumroad.com/l/yvjvs

🇮🇩 Bahasa Indonesia:
Headline / Judul:

✨ “Multichat Offline Paling Stabil di Dunia GUI — Tidak Bisa Ditiru Sembarangan”

Subheadline:

🏆Jalankan banyak tab dengan satu model sekaligus, 100% offline, stabil di PC mid-range. Advanced user GGUF pasti sulit menandingi ini.

Hook / Pembuka:

📌Bayangkan punya GUI AI offline yang memungkinkan kamu membuka banyak tab sekaligus tanpa crash. Semua sesi berjalan lancar di PC biasa — tanpa server mahal, tanpa cloud.

📌Ini bukan sekadar janji, ini rahasia source code di tangan saya sendiri, eksklusif, Coding Rakyat Edition.

Keunggulan Utama:

💻Multitab Stabil Tiada Tanding: Buka beberapa sesi AI sekaligus tanpa takut crash.

⚙️Offline 100%: Privasi total dan kontrol penuh.

📣 Tagline Ringkas: “Bukan Multisession, Ini Multitab Asli. Stabil, Offline, dan Eksklusif.”

📦Satu Model Tapi Efisien: Fokus stabilitas dulu, tapi tetap bisa handle banyak tab.

🔒Eksklusif & Tidak Bisa Direplikasi Mudah: Advanced user GGUF / offline model pun akan kesulitan meniru setup ini.

🎨UI Mirip ChatGPT, Mudah Dipakai: Langsung produktif, tanpa learning curve.

🚀 Multitab Asli ≠ Multisession 🚀

Selama ini kamu pikir multitab = multisession? Salah besar.
Punya gue 1 otak AI jalan bareng di banyak tab, tanpa tabrakan, tanpa timeout.

  • Buka 3–5 tab? Tetep sinkron.

  • Model cuma 14B Q6_K_XL, RAM 16GB? Masih ngacir.

  • Queue request otomatis → siapa input duluan, itu yang jalan.

Bukan “tab bohongan” kayak di webUI lain. Ini multitab asli non-multisession.
Sekali nyobain, balik ke multisession bakal berasa jadul.

😏 Ada yang berani tes bedanya?


🇬🇧 English Version:
Headline:

✨ “The Most Stable Offline Multichat in the GUI World — Not Easily Imitated”

Subheadline:

🏆 Run multiple tabs on a single model simultaneously, 100% offline, stable on a mid-range PC. Even advanced GGUF users will find this hard to beat.

Hook:

📌Imagine having an offline AI GUI that lets you open multiple tabs simultaneously without crashing. All sessions run smoothly on a regular PC — no expensive servers, no cloud.

📌This isn't just a promise; it's my own secret source code, exclusive, Coding Rakyat Edition.

Key Advantages:

💻Unparalleled Stability: Open multiple AI sessions simultaneously without fear of crashes.

⚙️100% Offline: Total privacy and full control.

📣 Short Tagline: “Not Multisession, This is Real Multitab. Stable, Offline, and Exclusive.”

📦One Model But Efficient: Focus on stability first, but still able to handle multiple tabs.

🔒Exclusive & Not Easily Replicable: Even advanced GGUF users / offline models will have a hard time replicating this setup.

🎨ChatGPT-like UI, Easy to Use: Instantly productive, no learning curve.

🚀 Real Multitab ≠ Multisession 🚀

You thought multitab = multisession? You're dead wrong.
I have one AI brain running simultaneously on multiple tabs, without collisions or timeouts.

  • Open 3–5 tabs? They stay in sync.

  • A 14B Q6_K_XL model on 16GB RAM? Still runs smoothly.

  • Automatic request queue → whoever inputs first, runs first.

Not "fake tabs" like in other webUIs. This is real, non-multisession multitab.
Once you try it, going back to multisession will feel old-fashioned.

😏 Anyone dare to test the difference?


📌 Dependency links to run the Llamacpp AI Chatbot GUI! (Link dependensi untuk menjalankan Llamacpp AI Chatbot GUI!)

Internal File / File Internal: https://drive.google.com/file/d/1KjT4iC6SH8lwCtIJB4DCiD2gAWE4Jf_p/view?usp=drive_link

Llamacpp Build Releases: https://github.com/ggml-org/llama.cpp/releases
GGUF Models: via Hugging Face

📌 Panduan Instalasi / Installation Guide:
- Letakkan file .exe dan file internal di dalam folder hasil ekstrak Llamacpp Build Releases (Place the .exe and the internal files inside the extracted Llamacpp Build Releases folder)

📸 Screenshot:

🎥 Video:






🇮🇩 Bahasa Indonesia:

✅ FAQ — Pertanyaan Umum (Trust Booster Edition) 

❓ GUI ini beneran bisa jalanin model 13B tanpa GPU? Ya! Sudah diuji langsung dengan model llama-2-13b-chat.Q4_K_M.gguf di sistem dengan:

💻 CPU: Intel i5-9400F (tanpa iGPU)

🧠 RAM: 16GB DDR4

⚙️ Backend: llama.cpp

📦 GUI: Llamacpp AI Chatbot GUI

❓ Bukti nyatanya mana? 📸 Screenshot saat load model dan idle sudah diunggah di folder docs/screenshots/

📄 Log lengkap sesi percobaan model 13B tersedia di docs/session-logs/

Tidak ada error, tidak crash, hanya delay wajar saat proses berat.

❓ GUI-nya berat gak? Tidak. GUI ini hanya 10KB, tanpa dependensi besar seperti Gradio atau Electron.

Tidak buka port aneh-aneh.

Tidak ada tracking.

Murni offline dan lokal.

UI sangat ringan, hanya berbasis tkinter.

❓ Bisa pakai model 7B, 8B, atau 13B lain? Bisa! Sudah diuji dengan:

Mistral 7B

DeepSeek Coder 6.7B

DeepSeek Coder 7B

Nous Hermes 13B (Q4_K_M)

LLaMA 13B (Q4_K_M)

❓ RAM saya cuma 8GB, bisa jalan? Bisa, asal model yang dipilih sesuai. Gunakan model kecil seperti:

TinyLlama 1.1B Q8_0

DeepSeek Coder 1.3B Q8_0

Mistral 4B Q8_0

Open Hermes 7B Q4_K_M

Atur max_tokens di GUI agar tidak melebihi kapasitas RAM kamu.

❓“Saya masih nggak percaya GUI ini bisa jalanin model 13B cuma dengan RAM 16GB. Beneran bisa?” 💬 “Coba sendiri aja bro 😎”

❓“Emang GUI-nya ringan banget ya?” ✅ Iya. Ukuran file .py cuma sekitar 10KB. Gak ada embel-embel web server, backend rumit, atau library berat.

❓“Bisa crash gak pas load model besar?” 🚫 Selama sistem kamu stabil dan swap file aktif, hampir nggak pernah crash. Bahkan log menunjukkan performa tetap normal walau pemakaian RAM di atas 15GB pas awal load.

❓“Ada buktinya?” 📸 Sudah ada screenshot dan log di folder docs/session-logs/ dan docs/screenshots/.

❓“Kalau saya nggak percaya tetap?” 😎 Silakan buktikan sendiri.


🇬🇧 English Version:

✅ FAQ — Frequently Asked Questions (Trust Booster Edition) 

❓ Can this GUI really run a 13B model without a GPU? ✅ Yes! Successfully tested with llama-2-13b-chat.Q4_K_M.gguf on:

💻 CPU: Intel i5-9400F (no iGPU)

🧠 RAM: 16GB DDR4

⚙️ Backend: llama.cpp

📦 GUI: Llamacpp AI Chatbot GUI

❓ Where’s the real proof? 📸 Screenshots during model load and idle are uploaded to docs/screenshots/ 📄 Complete 13B model session logs available in docs/session-logs/ ✅ No errors, no crashes. Just slight delay under heavy processing — perfectly normal.

❓ Is this GUI heavy? ❌ Not at all. It’s just 10KB. No bloated dependencies like Gradio or Electron. ✔️ No random ports. No tracking. ✔️ 100% offline and local. ✔️ Based purely on Tkinter.

❓ Can I use other 7B, 8B, or 13B models? ✅ Absolutely! Already tested with:

Mistral 7B

DeepSeek Coder 6.7B

DeepSeek Coder 7B

Nous Hermes 13B (Q4_K_M)

LLaMA 13B (Q4_K_M)

❓ I only have 8GB RAM, will it work? ✅ Yes, just use smaller models like:

TinyLlama 1.1B Q8_0

DeepSeek Coder 1.3B Q8_0

Mistral 4B Q8_0

Open Hermes 7B Q4_K_M

🛠️ Set max_tokens low to match your available RAM in the GUI settings.

❓ “I still don’t believe this GUI can run 13B on just 16GB RAM. Really?” 💬 “Try it yourself, bro. 😎”

❓ “Is the GUI really that lightweight?” ✅ Yep. The .py file is only about 10KB. No web servers, no complex backends, no heavy libraries.

❓ “Will it crash when loading large models?” 🚫 As long as your system is stable and a swap file is active, crashes are extremely rare. 📊 Logs show stable performance even with RAM usage above 15GB during model load.

❓ “Is there actual proof?” 📸 Yes. Screenshots and logs are available in the docs/session-logs/ and docs/screenshots/ folders.

❓ “What if I still don’t believe?” 😎 Feel free to test it yourself.