diff --git a/_authors/awssamit.markdown b/_authors/awssamit.markdown
new file mode 100644
index 0000000000..d86c50b4b8
--- /dev/null
+++ b/_authors/awssamit.markdown
@@ -0,0 +1,9 @@
+---
+name: Samit Kumbhani
+short_name: awssamit
+photo: '/assets/media/authors/awssamit.jpg'
+github: awssamit
+linkedin: https://www.linkedin.com/in/samitkumbhani/
+---
+
+Samit is an AWS Sr. Solutions Architect in the New York City area. He has 18 years of experience building applications and focuses on Analytics, Business Intelligence, and Databases. He enjoys working with customers to understand and solve their challenges by creating innovative solutions using AWS services. Outside of work, Samit loves playing cricket, traveling, and biking.
diff --git a/_events/2023-1114-community-meeting.markdown b/_events/2023-1114-community-meeting.markdown
index b8d08b2495..850376df7c 100644
--- a/_events/2023-1114-community-meeting.markdown
+++ b/_events/2023-1114-community-meeting.markdown
@@ -1,6 +1,6 @@
 ---
-eventdate: 2023-11-14 15:00:00 -0700
+eventdate: 2023-11-14 15:00:00 -0800
 title: OpenSearch Community Meeting - 2023-11-14
 online: true
diff --git a/_posts/2023-10-28-optimize-refresh-interval.markdown b/_posts/2023-10-28-optimize-refresh-interval.markdown
new file mode 100644
index 0000000000..36f372b12f
--- /dev/null
+++ b/_posts/2023-10-28-optimize-refresh-interval.markdown
@@ -0,0 +1,77 @@
+---
+layout: post
+title: "Optimize OpenSearch Refresh Interval"
+authors:
+  - ev2900
+  - awssamit
+date: 2023-11-13
+categories:
+  - technical-posts
+excerpt: Learn how to optimize the refresh interval of an OpenSearch index and strike a balance between the speed at which indexed information becomes available for search and the CPU and I/O costs of refreshing
+meta_description: Learn how to optimize the refresh interval of an OpenSearch index and strike a balance between the speed at which indexed information becomes available for search and the CPU and I/O costs of refreshing
+meta_keywords: OpenSearch refresh interval, refresh interval optimization, optimize OpenSearch index performance
+---
+This blog post discusses how to optimize the refresh interval of an OpenSearch index and how that optimization can improve OpenSearch performance.
+
+
+## Introduction
+When OpenSearch indexes documents, it first places them into a memory buffer. At this stage, the documents are not yet searchable. To make them searchable, a refresh operation is required. This operation transfers the documents from the memory buffer to new segments. Segments are the data structures that OpenSearch uses to store and retrieve documents. Once the documents are written to segments, they become searchable.
+
+
+OpenSearch manages the refresh operation automatically. By default, OpenSearch refreshes an index every 1 second, provided the index has received one or more search requests in the past 30 seconds. This means that documents written to an active index should typically become searchable within 1 second of being written to OpenSearch. While the default refresh interval for an index is 1 second, this setting can be adjusted on a per-index basis.
+
+## Why adjust the default index refresh interval
+Refresh operations are resource intensive. Transferring data into new segments and making it searchable consumes CPU, memory, and input/output (I/O) resources. Consequently, running fewer refresh operations can conserve these resources for other tasks.
+
+
+However, less frequent refreshes also mean a longer wait before newly indexed documents become searchable. If your use case requires near real-time search of new data, infrequent refreshes may not be appropriate. On the other hand, if your use case can tolerate a delay between indexing data and searching it, reducing the frequency of refreshes can free up resources. This could potentially lead to higher indexing throughput.
+
+## View the refresh interval
+The frequency of refresh operations is determined by the refresh interval set for an OpenSearch index. By default, the refresh interval for an index is 1 second. This means that a refresh operation runs every second, provided the index is active. An index is considered active if it has received one or more search requests within the last 30 seconds.
+
+Assuming we are using an index named ```sample_data```, we can check the refresh interval for this index by running the following API command:
+
+
+```GET /sample_data/_settings/index.refresh_interval```
+
+In this example, the refresh interval of the ```sample_data``` index is 1 second.
+
+
+get refresh{: .img-fluid }
+
+Note that if a refresh interval has not been set manually, the API call may not return a value. The default refresh interval is 1 second, but this property is not included in the ```_settings``` API response unless it has been explicitly set or adjusted.
+
+
+## Change the refresh interval
+You can adjust the refresh interval for an index by using the ```_settings``` API. In the following example, the refresh interval of the ```sample_data``` index is set to 60 seconds:
+
+
+```
+PUT /sample_data/_settings
+{
+  "index" : {
+    "refresh_interval" : "60s"
+  }
+}
+```
+
+change refresh{: .img-fluid }
+
+It is also possible to disable automatic refreshes. Setting ```"refresh_interval" : "-1"``` disables all automatic refreshing. In this scenario, the index must be refreshed manually using the ```_refresh``` API.
+
+
+The following example API call manually triggers a refresh on the index ```sample_data```:
+
+
+```POST sample_data/_refresh```
+
+manual refresh{: .img-fluid }
+
+You can disable automatic refreshes before starting a known write-intensive workload and then manually trigger a refresh once it completes. For instance, if you upload new data to OpenSearch daily through a batch process, it might be beneficial to disable automatic refreshes just before the batch process begins. After the process concludes, you can manually initiate a refresh.
+
+
+## Conclusion and other resources
+Adjusting the default refresh interval to balance how quickly new documents become searchable against the CPU and I/O costs of the refresh operation can improve OpenSearch performance. A shorter refresh interval means more frequent refreshes, so documents become searchable sooner after indexing, but at the expense of increased resource utilization.
+
+If you prefer to learn about this topic in the format of a video instead of a blog post, check out the YouTube video [OpenSearch - How to change the refresh interval of an index](https://www.youtube.com/watch?v=8uyemEfgcY8). This blog post is based on the GitHub repository [OpenSearch_Refresh_Interval](https://github.com/ev2900/OpenSearch_Refresh_Interval).
+
diff --git a/assets/media/authors/awssamit.jpg b/assets/media/authors/awssamit.jpg
new file mode 100644
index 0000000000..f618c84329
Binary files /dev/null and b/assets/media/authors/awssamit.jpg differ
diff --git a/assets/media/blog-images/2023-10-28-optimize-refresh-interval/change_refresh_1.png b/assets/media/blog-images/2023-10-28-optimize-refresh-interval/change_refresh_1.png
new file mode 100644
index 0000000000..ff3a4795d9
Binary files /dev/null and b/assets/media/blog-images/2023-10-28-optimize-refresh-interval/change_refresh_1.png differ
diff --git a/assets/media/blog-images/2023-10-28-optimize-refresh-interval/get_refresh_0.png b/assets/media/blog-images/2023-10-28-optimize-refresh-interval/get_refresh_0.png
new file mode 100644
index 0000000000..39f0c498af
Binary files /dev/null and b/assets/media/blog-images/2023-10-28-optimize-refresh-interval/get_refresh_0.png differ
diff --git a/assets/media/blog-images/2023-10-28-optimize-refresh-interval/manual_refresh_2.png b/assets/media/blog-images/2023-10-28-optimize-refresh-interval/manual_refresh_2.png
new file mode 100644
index 0000000000..a4ec4d4198
Binary files /dev/null and b/assets/media/blog-images/2023-10-28-optimize-refresh-interval/manual_refresh_2.png differ
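
The batch-load pattern the post describes (disable automatic refreshes, write the data, then refresh manually) might be sketched in Python as the ordered list of REST requests a client would send. This is only an illustration under stated assumptions: the helper names `refresh_interval_request`, `manual_refresh_request`, and `batch_load_plan` are hypothetical, and the final step that restores an interval afterward is an assumption that the post itself does not prescribe. The sketch builds the requests rather than sending them, so it needs no cluster or client library.

```python
import json


def refresh_interval_request(index, interval):
    """Build the _settings request that sets refresh_interval.

    Passing "-1" as the interval disables automatic refreshes.
    """
    body = {"index": {"refresh_interval": interval}}
    return ("PUT", f"/{index}/_settings", json.dumps(body))


def manual_refresh_request(index):
    """Build the _refresh request that triggers a one-off refresh."""
    return ("POST", f"/{index}/_refresh", None)


def batch_load_plan(index, restore_interval="1s"):
    """Return the requests surrounding a batch load, in order.

    restore_interval is an assumption of this sketch: it re-enables
    automatic refreshes once the batch has finished.
    """
    return [
        refresh_interval_request(index, "-1"),  # 1. disable automatic refreshes
        # (the write-intensive batch load would run here)
        manual_refresh_request(index),  # 2. make the new documents searchable
        refresh_interval_request(index, restore_interval),  # 3. restore interval
    ]


if __name__ == "__main__":
    for method, path, body in batch_load_plan("sample_data"):
        print(method, path, body or "")
```

Each tuple maps directly onto the console commands shown in the post, so the same sequence can be replayed by hand or passed to whichever HTTP client is already in use.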