Invoke an LLM using LangChain4J - PART 1: Container Image

In this guide, we will extend the example created in WildFly Java Microservice - PART 1: Container Image and consume an external LLM API through LangChain4J.

Prerequisites

To complete this guide, you need:

LLM

You can use any LLM supported by LangChain4J; in this guide we will use the smollm2 model and run it using the Ollama Docker Image.

Ollama + smollm2

Start Ollama:

podman network create demo-network

podman volume create ollama
podman run --rm --network=demo-network -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Note
We started the container with the --rm flag: this way it is disposed of automatically when we stop it
Note
We created the demo-network network and started the ollama container with the --network=demo-network option: later in this guide, this will allow us to connect to the ollama container from the my-jaxrs-app-llm container

Install the smollm2 LLM inside the ollama container:

podman exec -it ollama ollama pull smollm2

Test that your LLM is working by invoking its API as in the following example:

$ curl http://localhost:11434/api/generate -d '{ "model": "smollm2", "prompt":"Hi! My name is Tommaso"}'
{"model":"llama3.1:8b","created_at":"2025-03-19T16:26:19.388567244Z","response":"C","done":false}
{"model":"llama3.1:8b","created_at":"2025-03-19T16:26:19.518556766Z","response":"iao","done":false}
{"model":"llama3.1:8b","created_at":"2025-03-19T16:26:19.646976123Z","response":" Tom","done":false}
{"model":"llama3.1:8b","created_at":"2025-03-19T16:26:19.77417658Z","response":"mas","done":false}
{"model":"llama3.1:8b","created_at":"2025-03-19T16:26:19.901190847Z","response":"o","done":false}
{"model":"llama3.1:8b","created_at":"2025-03-19T16:26:20.033013914Z","response":"!","done":false}
…​

Maven Project

You will extend the sample application you created in WildFly Java Microservice - PART 1: Container Image by adding the wildfly-ai-feature-pack, which will:

  • add the necessary LangChain4J modules to the server (i.e. the dependencies needed to work with the model of choice)

  • add the configuration related to the model of choice; e.g. when using the ollama-chat-model layer, the feature pack configures the server to connect to an external Ollama service (see the sketch after this list)
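
The net effect, which you will see in the Java code later in this guide, is that the application can simply inject a chat model the server has already configured. A minimal sketch, where GreetingService is an illustrative class name and not part of the sample application:

package org.wildfly.examples;

// Sketch only: the ollama-chat-model layer lets application code inject a chat model
// that the server has already configured; no manual LangChain4J setup is required.
import dev.langchain4j.model.chat.ChatLanguageModel;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.inject.Named;

@ApplicationScoped
public class GreetingService {

    @Inject
    @Named(value = "ollama")
    ChatLanguageModel model;   // provided and configured by the wildfly-ai-feature-pack
}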

pom.xml

dependencies

Add the following dependencies to the dependencies section of the pom.xml file:

        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
            <version>1.0.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-core</artifactId>
            <version>1.0.1</version>
            <scope>provided</scope>
        </dependency>
Note
We are not adding LLM-specific LangChain4J dependencies because they will be added automatically to the server by the wildfly-ai-feature-pack, based on the specific AI layer we are using (e.g. the ollama-chat-model layer will add the dev.langchain4j:langchain4j-ollama module to the server)

wildfly-maven-plugin

Add the wildfly-ai-feature-pack feature-pack and the ollama-chat-model layer to the wildfly-maven-plugin configuration.

You should end up with the wildfly-maven-plugin configured as follows:

    <plugin>
        <groupId>org.wildfly.plugins</groupId>
        <artifactId>wildfly-maven-plugin</artifactId>
        <version>5.1.3.Final</version>
        <configuration>
            <feature-packs>
                <feature-pack>
                    <location>org.wildfly:wildfly-galleon-pack:36.0.1.Final</location>
                </feature-pack>
                <feature-pack>
                    <location>org.wildfly.cloud:wildfly-cloud-galleon-pack:8.0.0.Final</location>
                </feature-pack>
                <feature-pack>
                    <location>org.wildfly:wildfly-ai-feature-pack:0.6.0</location>
                </feature-pack>
            </feature-packs>
            <layers>
                <layer>cloud-server</layer>
                <layer>ollama-chat-model</layer>
            </layers>
        </configuration>
        <executions>
            <execution>
                <goals>
                    <goal>package</goal>
                </goals>
            </execution>
        </executions>
    </plugin>
Note
The wildfly-maven-plugin configuration can be simplified by using wildfly-glow: wildfly-glow inspects your deployment and automatically figures out which feature-packs and layers to use! Just replace the whole configuration section with the following:
<configuration>
    <discoverProvisioningInfo>
        <context>cloud</context>
        <spaces>
            <space>incubating</space>
        </spaces>
    </discoverProvisioningInfo>
</configuration>

Java Classes

Replace the content of the org.wildfly.examples.GettingStartedEndpoint class with the following:

org.wildfly.examples.GettingStartedEndpoint:
package org.wildfly.examples;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatLanguageModel;
import jakarta.enterprise.context.RequestScoped;
import jakarta.inject.Inject;
import jakarta.inject.Named;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;

@Path("/")
@RequestScoped
public class GettingStartedEndpoint {
    @Inject
    @Named(value = "ollama")
    ChatLanguageModel model;

    @GET
    @Path("/{name}")
    @Produces(MediaType.TEXT_PLAIN)
    public Response sayHello(final @PathParam("name") String name) {
        ChatMemory memory = MessageWindowChatMemory.withMaxMessages(5);
        UserMessage message1 = UserMessage.from("Hi! my name is " + name);
        memory.add(message1);
        AiMessage response1 = model.chat(memory.messages()).aiMessage();
        memory.add(response1);
        return Response.ok(response1).build();
    }
}

Delete the org.wildfly.examples.GettingStartedService class, which isn’t used anymore at this point, since the LLM is now responsible for greeting us!
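
Note
The check at the end of this guide also calls a /get-previous-name endpoint and expects the chat memory to survive across requests, while the class above creates a new memory on every call to sayHello. Below is a minimal sketch of one way to support that check; it is an illustrative variant, not the original sample code: the MEMORY field and the getPreviousName method are assumptions, with the memory held in a static field so it is shared across requests.

package org.wildfly.examples;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatLanguageModel;
import jakarta.enterprise.context.RequestScoped;
import jakarta.inject.Inject;
import jakarta.inject.Named;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;

@Path("/")
@RequestScoped
public class GettingStartedEndpoint {

    @Inject
    @Named(value = "ollama")
    ChatLanguageModel model;

    // Sketch only: shared across requests so the model can recall earlier messages
    private static final ChatMemory MEMORY = MessageWindowChatMemory.withMaxMessages(5);

    @GET
    @Path("/{name}")
    @Produces(MediaType.TEXT_PLAIN)
    public Response sayHello(final @PathParam("name") String name) {
        UserMessage message = UserMessage.from("Hi! my name is " + name);
        MEMORY.add(message);
        AiMessage response = model.chat(MEMORY.messages()).aiMessage();
        MEMORY.add(response);
        return Response.ok(response).build();
    }

    @GET
    @Path("/get-previous-name")
    @Produces(MediaType.TEXT_PLAIN)
    public Response getPreviousName() {
        UserMessage question = UserMessage.from("What is my name?");
        MEMORY.add(question);
        AiMessage answer = model.chat(MEMORY.messages()).aiMessage();
        MEMORY.add(answer);
        return Response.ok(answer).build();
    }
}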

Build the application

$ mvn clean package
...
[INFO] Copy deployment /home/tborgato/projects/guides/get-started-microservices-on-kubernetes/simple-microservice-llm/target/ROOT.war to /home/tborgato/projects/guides/get-started-microservices-on-kubernetes/simple-microservice-llm/target/server/standalone/deployments/ROOT.war
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  6.694 s
[INFO] Finished at: 2025-03-19T14:39:09+01:00
[INFO] ------------------------------------------------------------------------

Docker Image

Build the Docker Image

Build the Docker Image with the following command:

$ podman build -t my-jaxrs-app-llm:latest .
STEP 1/3: FROM quay.io/wildfly/wildfly-runtime:latest
STEP 2/3: COPY --chown=jboss:root target/server $JBOSS_HOME
--> 026526b27879
STEP 3/3: RUN chmod -R ug+rwX $JBOSS_HOME
COMMIT my-jaxrs-app-llm:latest
--> 1cae487d4086
Successfully tagged localhost/my-jaxrs-app-llm:latest
1cae487d408603eedebdc5f7d116ce70a4bfa5c1d44d8eeca890645973039899
Note
You can use the wildfly-maven-plugin to automate the image build

Run the Docker Image

Note that, when running the my-jaxrs-app-llm:latest Docker Image, we specify some environment variables used by WildFly to connect to the Ollama service:

podman run --rm --network=demo-network -p 8080:8080 -p 9990:9990 \
    -e OLLAMA_CHAT_URL=http://ollama:11434 \
    -e OLLAMA_CHAT_LOG_REQUEST=true \
    -e OLLAMA_CHAT_LOG_RESPONSE=true \
    -e OLLAMA_CHAT_TEMPERATURE=0.9 \
    -e OLLAMA_CHAT_MODEL_NAME=smollm2 \
    --name=my-jaxrs-app-llm \
    my-jaxrs-app-llm:latest
Note
We started the my-jaxrs-app-llm container with the --network=demo-network option just like we did when we started the ollama container: the two containers now run in the same demo-network network and we can connect to the ollama container from the my-jaxrs-app-llm container using the ollama DNS name;
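
For reference, these OLLAMA_CHAT_* variables map onto the LangChain4J Ollama chat model settings. You do not need to write any of this yourself, since the ollama-chat-model layer builds the model from these variables, but a rough sketch of the equivalent programmatic setup, assuming the langchain4j-ollama builder API, would be:

// Rough sketch only: the programmatic equivalent of the OLLAMA_CHAT_* environment
// variables above, using the langchain4j-ollama builder API (shown for illustration;
// the wildfly-ai-feature-pack does this wiring for you).
import dev.langchain4j.model.ollama.OllamaChatModel;

public class OllamaModelSketch {
    public static void main(String[] args) {
        var model = OllamaChatModel.builder()
                .baseUrl("http://ollama:11434")   // OLLAMA_CHAT_URL
                .modelName("smollm2")             // OLLAMA_CHAT_MODEL_NAME
                .temperature(0.9)                 // OLLAMA_CHAT_TEMPERATURE
                .logRequests(true)                // OLLAMA_CHAT_LOG_REQUEST
                .logResponses(true)               // OLLAMA_CHAT_LOG_RESPONSE
                .build();
        System.out.println(model.chat("Hi! My name is Tommaso"));
    }
}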

Check the application

Point your browser to the http://localhost:8080/api/tom URL and you should receive a response like:

AiMessage { text = "Nice to meet you, Tom! I'm happy to chat with you. What's on your mind today?" toolExecutionRequests = [] }

Now point your browser to http://localhost:8080/api/get-previous-name and you should receive a response like:

AiMessage { text = "I already knew that, Tom! You told me earlier, remember? Your name is... (drumroll) ...Tom!" toolExecutionRequests = [] }

which shows that the chat memory actually works and the LLM can recall your name from the previous conversation.

Stop the Docker containers

Stop the running container:

podman stop my-jaxrs-app-llm
podman stop ollama