AI-Chatbot DeepSeek mit Ollama ausführen

Einleitung

Dieses Tutorial erklärt wie man DeepSeek (ein LLM) mithilfe von Ollama verwendet, einem KI-Management-Tool zum Ausführen von KI-Modellen. Das Tutorial bietet zudem ein paar Hintergrundinformationen über Reasoning-Modelle und erklärt, wie diese funktionieren.

Voraussetzungen

Ein Server mit Ubuntu/Debian
- Zugriff auf den root-Benutzer oder einen Benutzer mit sudo-Rechten.
- Basiseinstellungen einschließlich einer Firewall sollten bereits eingerichtet worden sein.
Mindestanforderungen für die Hardware
- CPU: Im Idealfall — nicht zwingend erforderlich — Intel/AMD CPU, die AVX512 oder DDR5 unterstützt. Zum Prüfen:
```
lscpu | grep -o 'avx512' && sudo dmidecode -t memory | grep -i "DDR5"
```
- RAM: 16GB
- Speicherplatz: ca. 50GB
- GPU: Nicht erforderlich

Das Tutorial wurde getestet auf einem Hetzner Server der GEX-Serie mit einer CUDA-fähigen GPU, um Ollama effizient auszuführen. Wir empfehlen das Verwenden einer Linux-Distribution.

Schritt 1 - Konzept und Anwendungsfälle von Reasoning-Modellen

Wie du vermutlich bereits weißt, ist DeepSeek R1 ein Reasoning-Modell — eine fortschrittliche Art von Künstlicher Intelligenz, die für Aufgaben entwickelt wurde, die logisches, analytisches und kontextuelles Verständnis erfordern. Es ahmt menschliche kognitive Prozesse nach, wie z. B. Problemlösung, Deduktion, Schlussfolgerung und Entscheidungsfindung. Im Gegensatz zu traditionellen KI-Systemen, die sich ausschließlich auf Mustererkennung oder regelbasierte Operationen stützen, integriert ein Reasoning-KI-Modell komplexe kognitive Fähigkeiten, um Kontexte zu verstehen, aus unvollständigen Informationen Schlüsse zu ziehen und logische Entscheidungen zu treffen.

Wie das Modell trainiert wird:

Ein zentraler Aspekt von Reasoning-Modellen ist die Destillation. Anstatt große Mengen an Rohinformationen direkt zu speichern, wird das Modell darauf trainiert, die Antworten leistungsstärkerer LLMs auf eine Vielzahl von Fragen und Szenarien zu reproduzieren. Dadurch kann DeepSeek präzise Antworten liefern und bleibt gleichzeitig deutlich kompakter. Das macht fortschrittliche KI zugänglicher für kleinere Entwickler und Hobbyisten.

Das "Wissen" traditioneller LLMs ist oft auf die Datensätze begrenzt, mit denen sie trainiert wurden. Reasoning-Modelle hingegen lernen aus den Ausgaben mehrerer LLMs. Durch die Destillation von Wissen aus verschiedenen Quellen, profitiert DeepSeek von einer breiteren Wissensbasis.

Szenarien, in denen das Modell eingesetzt wird:

Reasoning-Modelle verarbeiten Informationen, um logische Schlüsse zu ziehen und komplexe Probleme zu lösen. Sie sind besonders nützlich in Anwendungen, die datenbasierte Entscheidungsfindung erfordern. Wenn du dem Modell Informationen gibst, kann es daraus neue Erkenntnisse ableiten oder Probleme auf strukturierte Weise lösen.

Wie das Modell antwortet:

Wenn du eine Frage stellst, gibt das Modell eine Antwort zurück, die aus zwei Abschnitten besteht:

Abschnitt	Beschreibung
Chain of Thought	Das Modell "denkt" über das Problem nach. Es führt eine Schritt-für-Schritt-Analyse durch, legt Zwischenschritte dar und erklärt seinen Denkprozess.
Antwort	Das endgültige Ergebnis oder die Schlussfolgerung, die aus dem Denkprozess abgeleitet wurde.

Im Terminal sieht das in etwa so aus:

<think>
Okay, so I need to ...
</think>

In conclusion, you ...

Schritt 2 - Ollama installieren

Wir werden Ollama verwenden. Für einen schnellen Start, kannst du das Installations-Skript nutzen:

Wenn du Ollama manuell installieren möchtest, kannst du Schritt 1 in dem Tutorial "Einen eigenen AI-Chatbot hosten mit Ollama und Open WebUI" folgen.

curl -fsSL https://ollama.com/install.sh | sh

Damit wird die Installation auf der Maschine gestartet. Sobald die Installation abgeschlossen ist, kannst du direkt loslegen.

Schritt 3 - DeepSeek ausführen

Im Terminal kannst du nun mit Ollama ein KI-Modell starten (siehe diese Liste von verfügbaren Sprachmodellen). Im folgenden wird das open-source Reasoning-Modell DeepSeek R1 gestartet. Ollama bietet alle verfügbaren Modellgrößen (1.5b, 7b, 8b, 14b, 32b, 70b und 671b Parameter). Mit dem Hetzner Server GEX130 kannst du bis zu einem 70b-Modell mit Quantisierung ausführen. Auf dem Hetzner Server GEX44 empfehlen wir, mit dem 7b-Modell zu starten und je nach Bedarf zu skalieren.

ollama run deepseek-r1:7b

Du kannst nun anfangen Fragen zu stellen. In manchen Fällen kann es passieren, dass die Antwort unerwartet Chinesischen Text enthält. Dies ist ein bekanntes Problem. Falls dies passiert, kannst du versuchen das zu vermeiden, indem du in deiner Anfrage "Antworte auf Englisch" ergänzt.

Schritt 4 - DeepSeek verwenden

Folgend werden zwei Beispiele analysiert — eine Anfrage zur Entscheidungsfindung und eine Anfrage zu einem Code.

Beispiel zur Entscheidungsfindung

Angenommen man stellt folgende Frage:

What is better — a reasoning model or a language model?

Die Antwort von DeepSeek R1 sieht in etwa so aus:

<think>
Alright, so I’m trying to figure out whether a reasoning model is better than a language model or vice versa. First off, I need to understand what each of these actually does.
A **language model** seems pretty straightforward—it's about understanding and generating human language. So if I were to ask it to write a essay or translate something, it would do that for me. It probably learns from vast amounts of text data to get better at predicting the next word or sentence.
On the other hand, a **reasoning model** sounds more complex. From what I gather, reasoning models aren't just about language; they also process information and make logical deductions based on that information. If you gave it some facts, it could use those to infer new knowledge or solve problems in a structured way.
So comparing the two, it seems like a language model is specialized for handling linguistic tasks, while a reasoning model handles more cognitive tasks involving logic and inference. But I’m not sure if one is inherently better than the other—it probably depends on what you need for your task.

Maybe it's about knowing how they complement each other. Like in real-world scenarios, you often need both language to communicate and reasoning to understand context or make decisions based on information. So perhaps using them together provides a more comprehensive solution.
I also wonder if there are situations where one outperforms the other exclusively. For example, if I'm building an app that needs to have deep conversations with users (like chatbots), a language model would be essential for making the interactions natural and coherent. But if the app needs to understand complex data or solve intricate problems based on that data, then maybe a reasoning model would add more value.

Another thought is about their applications. Language models are widely used in areas like NLP tasks (Natural Language Processing), text generation, translation, etc., because they're so fundamental to how we communicate. Reasoning models might be more niche but crucial for specialized AI systems that need to perform logical reasoning, such as search engines, expert systems, or autonomous systems.
I also think about the advancements in each area. Language models have seen huge improvements with things like transformer architectures (e.g., BERT, GPT), making them more accurate and versatile. Reasoning models might benefit from advancements in logic-based AI or probabilistic reasoning, but they could be slower or less accessible to develop compared to language models.
Moreover, there's the aspect of integration. Combining both capabilities into a single model might offer more flexibility and robustness for complex tasks. However, this also adds complexity and computational resources, which isn't always feasible.

In summary, it seems like neither is universally better than the other. It really depends on what you're trying to achieve—whether it's handling language-based tasks with a language model or tackling cognitive reasoning tasks with a reasoning model.
</think>

In conclusion, whether a **reasoning model** or a **language model** is "better" depends entirely on the specific
task or application at hand. Here's a summary of the key points:

1. **Language Models**:   These are specialized for understanding and generating human language. They excel in tasks
                          such as text generation, translation, summarization, and conversational AI (e.g., chatbots). Advances like
                          transformer architectures (e.g., BERT, GPT) have significantly enhanced their capabilities.

2. **Reasoning Models**:  These focus on processing information to make logical deductions and solve complex
                          problems. They are useful in applications requiring decision-making based on data, 
                          such as expert systems or autonomous systems.

3. **Complementary Use**: Often, the best solutions integrate both approaches. Language models handle
                          communication, while reasoning models provide cognitive capabilities. This 
                          combination can lead to more comprehensive AI systems tailored to specific needs.

In diesem Beispiel zur Entscheidungsfindung sieht man, dass das Modell zunächst beide Optionen definiert, bevor es Vergleiche anstellt oder Entscheidungen trifft. Es untersucht deren Stärken und vermeidet voreilige Schlussfolgerungen, indem es verschiedene Perspektiven und Szenarien betrachtet.

Beispiel zum Code

Angenommen man stellt folgende Frage:

How does a simple Python script look like that returns "Hello World!".

Die Antwort von DeepSeek R1 sieht in etwa so aus:

<think>
Okay, so I need to write a simple Python script that outputs "Hello World!". Hmm, let me think about how to do this. I remember from my classes that in Python, you can print messages using the print() function. So maybe it's as simple as writing something like print("Hello World!"). But wait, is there more to it?
I guess I should make sure that the string inside the print function is exactly "Hello World!". Let me check if I need quotes around each letter or not. No, in Python, you can just put the entire string inside the quotes and pass it directly to print(). So writing print("Hello World!") should do it.

Wait, are there any other requirements? The user asked for a simple script that returns "Hello World!", so maybe they just want the code without any extra explanations or error handling. That makes sense because adding too much might complicate things when someone is just starting out.
I should also consider if I need to run it in a specific environment, but since it's Python, as long as the syntax is correct, it should work on most setups. Maybe the user doesn't need to worry about that for such a simple script.

Putting it all together, the script would be just one line: print("Hello World!"). I don't think there are any imports needed here because this is a straightforward print statement without any functions or classes being called from libraries. So that should do it.
</think>

Here's a simple Python script that outputs "Hello World!":

```python
print("Hello World!")
```

When you run this script, it will display the text "Hello World!" in the console.

In diesem Codebeispiel sieht man, dass das Modell die Antwort bereits von Anfang an kennt. Anstatt sofort zu antworten, hinterfragt es aber seine eigene Logik und prüft, ob es etwas übersehen oder einen Fehler gemacht hat. Zudem analysiert das Modell verschiedene Szenarien und Umgebungen, bevor es tatsächlich antwortet.

Ergebnis

In diesem Tutorial wurden die technischen Konzepte von DeepSeek ein wenig beleuchtet und erklärt, wie man es verwendet. Wir empfehlen ein bisschen mit den verschiedenen Parametergrößen von DeepSeek R1 herumzuspielen und auch andere verfügbare KI-Modelle auszuprobieren, die auf ollama.com/search verfügbar sind. Mit ollama run <model_name> können auch alle anderen Ollama-basierten Modelle gestartet werden.

Für weitere Informationen, wie man die Modelle in einer benutzerfreundlichen Umgebung verweden kann, empfehlen wir dieses ausführliche Tutorial zur Einrichtung von Open WebUI und KI-Modellen.

AI-Chatbot DeepSeek mit Ollama ausführen

Author

Published

Time to read

Table of Contents