The throughput of a Kinesis data stream is determined by the number of shards within the data stream. Follow the steps below to estimate the initial number of shards your data stream needs in provisioned mode. Note that you can dynamically adjust the number of shards within your data stream through resharding.
Estimate the average size of the record written to the data stream in kilobytes (KB), rounded up to the nearest 1 KB. (average_data_size_in_KB)
Estimate the number of records written to the data stream per second. (number_of_records_per_second)
Decide the number of Amazon Kinesis Applications consuming data concurrently and independently from the data stream. (number_of_consumers)
Calculate the incoming write bandwidth in KB per second (incoming_write_bandwidth_in_KB), which is equal to the average_data_size_in_KB multiplied by the number_of_records_per_second.
Calculate the outgoing read bandwidth in KB per second (outgoing_read_bandwidth_in_KB), which is equal to the incoming_write_bandwidth_in_KB multiplied by the number_of_consumers.
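You can then calculate the initial number of shards (number_of_shards) your data stream needs using the following formula: number_of_shards = max(incoming_write_bandwidth_in_KB/1000, outgoing_read_bandwidth_in_KB/2000). The divisors reflect the per-shard capacity: each shard supports up to 1,000 KB per second of writes and up to 2,000 KB per second of reads.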
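The steps above can be sketched as a small Python helper. This is an illustrative calculation only, not an AWS API: the function name estimate_shards is hypothetical, and rounding the result up to a whole shard (with a floor of one shard) is an assumption added here, since the formula itself can yield a fractional value.

```python
import math


def estimate_shards(average_data_size_in_kb: float,
                    number_of_records_per_second: float,
                    number_of_consumers: int) -> int:
    """Estimate the initial shard count for a provisioned-mode data stream."""
    # Step 1: round the average record size up to the nearest 1 KB.
    size_kb = math.ceil(average_data_size_in_kb)

    # Incoming write bandwidth in KB per second.
    incoming_write_bandwidth_in_kb = size_kb * number_of_records_per_second

    # Outgoing read bandwidth in KB per second (every consumer reads all data).
    outgoing_read_bandwidth_in_kb = incoming_write_bandwidth_in_kb * number_of_consumers

    # Formula from the text: the larger of the write-bound and read-bound needs.
    raw_shards = max(incoming_write_bandwidth_in_kb / 1000,
                     outgoing_read_bandwidth_in_kb / 2000)

    # Assumption: round up to a whole shard and never return fewer than one.
    return max(1, math.ceil(raw_shards))


if __name__ == "__main__":
    # Example: 2.3 KB records, 1,000 records per second, 3 consumers.
    print(estimate_shards(2.3, 1000, 3))
```

In this example the record size rounds up to 3 KB, giving 3,000 KB per second of writes and 9,000 KB per second of reads. The read side dominates (9000/2000 = 4.5 versus 3000/1000 = 3), so the estimate rounds up to 5 shards.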