Class: Kafka::Client

Inherits:
Object
Defined in:
lib/kafka/client.rb

Instance Method Summary

Constructor Details

#initialize(seed_brokers:, client_id: "ruby-kafka", logger: nil, connect_timeout: nil, socket_timeout: nil, ssl_ca_cert: nil, ssl_client_cert: nil, ssl_client_cert_key: nil) ⇒ Client

Initializes a new Kafka client.

Parameters:

  • seed_brokers (Array<String>, String)

    the list of brokers used to initialize the client. Either an Array of connections, or a comma separated string of connections. A connection can either be a string of "host:port" or a full URI with a scheme; if there's a scheme, it is ignored and only the host and port are used.

  • client_id (String) (defaults to: "ruby-kafka")

    the identifier for this application.

  • logger (Logger) (defaults to: nil)

    the logger that should be used by the client.

  • connect_timeout (Integer, nil) (defaults to: nil)

    the timeout setting for connecting to brokers. See BrokerPool#initialize.

  • socket_timeout (Integer, nil) (defaults to: nil)

    the timeout setting for socket connections. See BrokerPool#initialize.

  • ssl_ca_cert (String, nil) (defaults to: nil)

    a PEM encoded CA cert to use with an SSL connection.

  • ssl_client_cert (String, nil) (defaults to: nil)

    a PEM encoded client cert to use with an SSL connection. Must be used in combination with ssl_client_cert_key.

  • ssl_client_cert_key (String, nil) (defaults to: nil)

    a PEM encoded client cert key to use with an SSL connection. Must be used in combination with ssl_client_cert.


# File 'lib/kafka/client.rb', line 43

def initialize(seed_brokers:, client_id: "ruby-kafka", logger: nil, connect_timeout: nil, socket_timeout: nil, ssl_ca_cert: nil, ssl_client_cert: nil, ssl_client_cert_key: nil)
  @logger = logger || Logger.new(nil)
  @instrumenter = Instrumenter.new(client_id: client_id)

  ssl_context = build_ssl_context(ssl_ca_cert, ssl_client_cert, ssl_client_cert_key)

  connection_builder = ConnectionBuilder.new(
    client_id: client_id,
    connect_timeout: connect_timeout,
    socket_timeout: socket_timeout,
    ssl_context: ssl_context,
    logger: @logger,
    instrumenter: @instrumenter,
  )

  broker_pool = BrokerPool.new(
    connection_builder: connection_builder,
    logger: @logger,
  )

  @cluster = Cluster.new(
    seed_brokers: normalize_seed_brokers(seed_brokers),
    broker_pool: broker_pool,
    logger: @logger,
  )
end
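
For illustration, constructing a client might look like the following sketch; the broker addresses and client id are placeholders, and the SSL arguments are only needed when connecting to a cluster that requires encryption.

require "kafka"
require "logger"

kafka = Kafka::Client.new(
  # Hypothetical broker addresses; any one reachable broker is enough to
  # bootstrap the rest of the cluster.
  seed_brokers: ["kafka1.example.com:9092", "kafka2.example.com:9092"],
  client_id: "my-application",
  logger: Logger.new($stderr),
)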

Instance Method Details

#async_producer(delivery_interval: 0, delivery_threshold: 0, max_queue_size: 1000, **options) ⇒ AsyncProducer

Creates a new AsyncProducer instance.

All parameters allowed by #producer can be passed; in addition, a few extra parameters are available when creating an async producer.

Parameters:

  • max_queue_size (Integer) (defaults to: 1000)

    the maximum number of messages allowed in the queue.

  • delivery_threshold (Integer) (defaults to: 0)

    if greater than zero, the number of buffered messages that will automatically trigger a delivery.

  • delivery_interval (Integer) (defaults to: 0)

    if greater than zero, the number of seconds between automatic message deliveries.

Returns:

  • (AsyncProducer)

See Also:

  • #producer


# File 'lib/kafka/client.rb', line 135

def async_producer(delivery_interval: 0, delivery_threshold: 0, max_queue_size: 1000, **options)
  sync_producer = producer(**options)

  AsyncProducer.new(
    sync_producer: sync_producer,
    delivery_interval: delivery_interval,
    delivery_threshold: delivery_threshold,
    max_queue_size: max_queue_size,
  )
end
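
As a sketch of typical usage (the topic name and message are placeholders), the async producer buffers writes and delivers them in a background thread, so it should be shut down explicitly:

producer = kafka.async_producer(
  delivery_interval: 10,   # flush the buffer every 10 seconds...
  delivery_threshold: 100, # ...or once 100 messages have been buffered.
)

# Returns immediately; actual delivery happens in the background.
producer.produce("hello", topic: "greetings")

# Delivers any remaining buffered messages and stops the background thread.
producer.shutdown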

#close ⇒ nil

Closes all connections to the Kafka brokers and frees up used resources.

Returns:

  • (nil)

# File 'lib/kafka/client.rb', line 282

def close
  @cluster.disconnect
end
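
A common pattern, sketched below with a placeholder broker address, is to guarantee cleanup with an ensure block:

kafka = Kafka::Client.new(seed_brokers: ["kafka1.example.com:9092"])

begin
  # ... use the client ...
ensure
  kafka.close
end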

#consumer(group_id:, session_timeout: 30, offset_commit_interval: 10, offset_commit_threshold: 0, heartbeat_interval: 10) ⇒ Consumer

Creates a new Kafka consumer.

Parameters:

  • group_id (String)

    the id of the group that the consumer should join.

  • session_timeout (Integer) (defaults to: 30)

    the number of seconds after which, if a client hasn't contacted the Kafka cluster, it will be kicked out of the group.

  • offset_commit_interval (Integer) (defaults to: 10)

    the interval between offset commits, in seconds.

  • offset_commit_threshold (Integer) (defaults to: 0)

    the number of messages that can be processed before their offsets are committed. If zero, offset commits are not triggered by message processing.

  • heartbeat_interval (Integer) (defaults to: 10)

    the interval between heartbeats; must be less than the session window.

Returns:

  • (Consumer)

# File 'lib/kafka/client.rb', line 159

def consumer(group_id:, session_timeout: 30, offset_commit_interval: 10, offset_commit_threshold: 0, heartbeat_interval: 10)
  group = ConsumerGroup.new(
    cluster: @cluster,
    logger: @logger,
    group_id: group_id,
    session_timeout: session_timeout,
  )

  offset_manager = OffsetManager.new(
    group: group,
    logger: @logger,
    commit_interval: offset_commit_interval,
    commit_threshold: offset_commit_threshold,
  )

  heartbeat = Heartbeat.new(
    group: group,
    interval: heartbeat_interval,
  )

  Consumer.new(
    cluster: @cluster,
    logger: @logger,
    instrumenter: @instrumenter,
    group: group,
    offset_manager: offset_manager,
    session_timeout: session_timeout,
    heartbeat: heartbeat,
  )
end
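
A typical consume loop, sketched here with a placeholder group id and topic, subscribes to a topic and then blocks, processing messages as they arrive:

consumer = kafka.consumer(group_id: "my-group")
consumer.subscribe("greetings")

# Blocks, yielding each message as it is fetched. Offsets are committed
# according to offset_commit_interval and offset_commit_threshold.
consumer.each_message do |message|
  puts message.topic, message.partition, message.offset, message.value
end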

#fetch_messages(topic:, partition:, offset: :latest, max_wait_time: 5, min_bytes: 1, max_bytes: 1048576) ⇒ Array<Kafka::FetchedMessage>

Note:

This API is still alpha level. Don't try to use it in production.

Fetches a batch of messages from a single partition. Note that it's possible to get back empty batches.

The starting point for the fetch can be configured with the :offset argument. If you pass a number, the fetch will start at that offset. However, there are two special Symbol values that can be passed instead:

  • :earliest — the first offset in the partition.
  • :latest — the next offset that will be written to, effectively making the call block until there is a new message in the partition.

The Kafka protocol specifies the numeric values of these two options: -2 and -1, respectively. You can also pass in these numbers directly.

Example

When enumerating the messages in a partition, you typically fetch batches sequentially.

offset = :earliest

loop do
  messages = kafka.fetch_messages(
    topic: "my-topic",
    partition: 42,
    offset: offset,
  )

  messages.each do |message|
    puts message.offset, message.key, message.value

    # Set the next offset that should be read to be the subsequent
    # offset.
    offset = message.offset + 1
  end
end

See a working example in examples/simple-consumer.rb.

Parameters:

  • topic (String)

    the topic that messages should be fetched from.

  • partition (Integer)

    the partition that messages should be fetched from.

  • offset (Integer, Symbol) (defaults to: :latest)

    the offset to start reading from. Default is the latest offset.

  • max_wait_time (Integer) (defaults to: 5)

    the maximum amount of time to wait before the server responds, in seconds.

  • min_bytes (Integer) (defaults to: 1)

    the minimum number of bytes to wait for. If set to zero, the broker will respond immediately, but the response may be empty. The default is 1 byte, which means that the broker will respond as soon as a message is written to the partition.

  • max_bytes (Integer) (defaults to: 1048576)

    the maximum number of bytes to include in the response message set. Default is 1 MB. You need to set this higher if you expect messages to be larger than this.

Returns:

  • (Array<Kafka::FetchedMessage>)

    the messages returned from the brokers.

# File 'lib/kafka/client.rb', line 251

def fetch_messages(topic:, partition:, offset: :latest, max_wait_time: 5, min_bytes: 1, max_bytes: 1048576)
  operation = FetchOperation.new(
    cluster: @cluster,
    logger: @logger,
    min_bytes: min_bytes,
    max_wait_time: max_wait_time,
  )

  operation.fetch_from_partition(topic, partition, offset: offset, max_bytes: max_bytes)

  operation.execute.flat_map {|batch| batch.messages }
end
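
For example, a one-shot fetch from the beginning of a partition (the topic and partition are placeholders) could look like this sketch:

# Waits up to 5 seconds for at least one byte of data before responding.
messages = kafka.fetch_messages(
  topic: "my-topic",
  partition: 0,
  offset: :earliest,
  max_wait_time: 5,
)

messages.each {|message| puts message.value }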

#partitions_for(topic) ⇒ Integer

Counts the number of partitions in a topic.

Parameters:

  • topic (String)

Returns:

  • (Integer)

    the number of partitions in the topic.


# File 'lib/kafka/client.rb', line 275

def partitions_for(topic)
  @cluster.partitions_for(topic).count
end
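
One use for this, sketched below with a hypothetical topic and key, is assigning a partition yourself by hashing a key onto the partition count:

require "zlib"

partition_count = kafka.partitions_for("my-topic")

# Messages with the same key will consistently map to the same partition.
partition = Zlib.crc32("some-key") % partition_count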

#producer(compression_codec: nil, compression_threshold: 1, ack_timeout: 5, required_acks: 1, max_retries: 2, retry_backoff: 1, max_buffer_size: 1000, max_buffer_bytesize: 10_000_000) ⇒ Kafka::Producer

Initializes a new Kafka producer.

Parameters:

  • ack_timeout (Integer) (defaults to: 5)

    The number of seconds a broker can wait for replicas to acknowledge a write before responding with a timeout.

  • required_acks (Integer) (defaults to: 1)

    The number of replicas that must acknowledge a write.

  • max_retries (Integer) (defaults to: 2)

    the number of retries that should be attempted before giving up sending messages to the cluster. Does not include the original attempt.

  • retry_backoff (Integer) (defaults to: 1)

    the number of seconds to wait between retries.

  • max_buffer_size (Integer) (defaults to: 1000)

    the number of messages allowed in the buffer before new writes will raise BufferOverflow exceptions.

  • max_buffer_bytesize (Integer) (defaults to: 10_000_000)

    the maximum size of the buffer in bytes. Attempting to produce messages when the buffer reaches this size will result in BufferOverflow being raised.

  • compression_codec (Symbol, nil) (defaults to: nil)

    the name of the compression codec to use, or nil if no compression should be performed. Valid codecs: :snappy and :gzip.

  • compression_threshold (Integer) (defaults to: 1)

    the number of messages that needs to be in a message set before it should be compressed. Note that message sets are per-partition rather than per-topic or per-producer.

Returns:

  • (Kafka::Producer)

    the Kafka producer.

# File 'lib/kafka/client.rb', line 100

def producer(compression_codec: nil, compression_threshold: 1, ack_timeout: 5, required_acks: 1, max_retries: 2, retry_backoff: 1, max_buffer_size: 1000, max_buffer_bytesize: 10_000_000)
  compressor = Compressor.new(
    codec_name: compression_codec,
    threshold: compression_threshold,
    instrumenter: @instrumenter,
  )

  Producer.new(
    cluster: @cluster,
    logger: @logger,
    instrumenter: @instrumenter,
    compressor: compressor,
    ack_timeout: ack_timeout,
    required_acks: required_acks,
    max_retries: max_retries,
    retry_backoff: retry_backoff,
    max_buffer_size: max_buffer_size,
    max_buffer_bytesize: max_buffer_bytesize,
  )
end
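
A sketch of synchronous production (the topic, key, and payload are placeholders); note that messages are buffered locally until deliver_messages is called:

producer = kafka.producer(
  compression_codec: :snappy,
  compression_threshold: 10, # only compress message sets of 10 or more
  required_acks: -1,         # wait for all in-sync replicas to acknowledge
)

producer.produce("hello", topic: "greetings", partition_key: "some-key")

# Nothing is written to the cluster until this point.
producer.deliver_messages

producer.shutdown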

#topics ⇒ Array<String>

Lists all topics in the cluster.

Returns:

  • (Array<String>)

    the list of topic names.


# File 'lib/kafka/client.rb', line 267

def topics
  @cluster.topics
end
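
For instance, the two metadata methods can be combined into a small cluster overview; this is just a sketch:

kafka.topics.each do |topic|
  puts "#{topic}: #{kafka.partitions_for(topic)} partitions"
end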