New blog, author, event edit #2423

Merged
merged 29 commits into from
Nov 13, 2023
Commits
f8a76b2
Add files via upload
ev2900 Oct 28, 2023
e65eb8e
Add files via upload
ev2900 Oct 28, 2023
7c00860
Delete _posts/optimize-refresh-interval.markdown
ev2900 Oct 28, 2023
0c5ff09
Add files via upload
ev2900 Oct 28, 2023
72b28f3
Update 2023-10-28-optimize-refresh-interval.markdown
ev2900 Oct 28, 2023
a466a24
uploading image for awssamit
awssamit Oct 30, 2023
9efd797
Create awssamit.markdown
awssamit Oct 30, 2023
1778a92
Update 2023-10-28-optimize-refresh-interval.markdown
ev2900 Oct 31, 2023
d09fa7e
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
f3e883a
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
000394c
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
3058c21
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
23e544b
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
f70428b
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
d13357e
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
66c29b6
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
17cd19e
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
29b4f77
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 1, 2023
98ea814
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 9, 2023
d5d0e18
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 9, 2023
232b0f6
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 9, 2023
3e08fa2
Update _posts/2023-10-28-optimize-refresh-interval.markdown
ev2900 Nov 9, 2023
3fe5335
For some reason the lack of an "excerpt" frontmatter has caused the e…
nateynateynate Nov 13, 2023
38c4530
Merge pull request #2399 from awssamit/main
nateynateynate Nov 13, 2023
4672f85
Fix date of next community meeting
smortex Nov 13, 2023
2e5f41a
Merge pull request #2397 from ev2900/main
krisfreedain Nov 13, 2023
0f892b1
Merge pull request #2422 from smortex/fix-next-community-meeting-date
krisfreedain Nov 13, 2023
79d330a
Changing date to today's date so as to feature it.
nateynateynate Nov 13, 2023
0b2adfe
Merge pull request #2424 from nateynateynate/fix-blog
nateynateynate Nov 13, 2023
9 changes: 9 additions & 0 deletions _authors/awssamit.markdown
@@ -0,0 +1,9 @@
---
name: Samit Kumbhani
short_name: awssamit
photo: '/assets/media/authors/awssamit.jpg'
github: awssamit
linkedin: https://www.linkedin.com/in/samitkumbhani/
---

Samit is an AWS Sr. Solutions Architect in the New York City area. He has 18 years of experience building applications and focuses on Analytics, Business Intelligence, and Databases. He enjoys working with customers to understand and solve their challenges by creating innovative solutions using AWS services. Outside of work Samit loves playing cricket, traveling and biking.

Check failure on line 9 in _authors/awssamit.markdown (GitHub Actions / vale, column 1)

[OpenSearch.Spelling] Error: Samit. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.

Check failure on line 9 in _authors/awssamit.markdown (GitHub Actions / vale, column 327)

[OpenSearch.Spelling] Error: Samit. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.

Check warning on line 9 in _authors/awssamit.markdown (GitHub Actions / vale, column 347)

[OpenSearch.OxfordComma] Add an Oxford comma in 'cricket, traveling and biking.'.
2 changes: 1 addition & 1 deletion _events/2023-1114-community-meeting.markdown
@@ -1,6 +1,6 @@
---

-eventdate: 2023-11-14 15:00:00 -0700
+eventdate: 2023-11-14 15:00:00 -0800

title: OpenSearch Community Meeting - 2023-11-14
online: true
77 changes: 77 additions & 0 deletions _posts/2023-10-28-optimize-refresh-interval.markdown
@@ -0,0 +1,77 @@
---
layout: post
title: "Optimize OpenSearch Refresh Interval"
authors:
- ev2900
- awssamit
date: 2023-11-13
categories:
- technical-posts
excerpt: Learn how to optimize the refresh interval of an OpenSearch index and strike a balance between how quickly indexed information becomes searchable and the CPU and I/O costs of refreshing
meta_description: Learn how to optimize the refresh interval of an OpenSearch index and strike a balance between how quickly indexed information becomes searchable and the CPU and I/O costs of refreshing
meta_keywords: OpenSearch refresh interval, refresh interval optimization, optimize OpenSearch index performance
---
This blog post discusses optimizing the refresh interval of an OpenSearch index and how the optimization enhances OpenSearch performance.


## Introduction
In OpenSearch, the process of indexing documents initially places them into a memory buffer. At this stage, the documents are not yet searchable. To make these documents searchable, a refresh operation is required. This operation transfers the documents from the memory buffer to new segments. Segments are specific data structures that OpenSearch uses to store and retrieve documents. Once the documents are housed in these segments, they become searchable.


The refresh operation, which makes documents searchable by moving them into segments, is managed automatically by OpenSearch. By default, OpenSearch refreshes an index every 1 second, provided the index has received one or more search requests in the past 30 seconds. This means that documents written to an active index typically become searchable within 1 second of being written to OpenSearch. Although the default refresh interval is 1 second, this setting can be adjusted on a per-index basis.
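
To make the buffer-then-refresh flow concrete, here is a minimal sketch in Python. It is a toy model only: the `ToyIndex` class and its methods are hypothetical names invented for illustration and do not reflect how OpenSearch implements segments internally.

```python
class ToyIndex:
    """Toy model of the buffer-then-refresh flow; not OpenSearch internals."""

    def __init__(self):
        self.buffer = []    # documents indexed but not yet searchable
        self.segments = []  # each refresh appends one searchable segment

    def index(self, doc):
        # Indexing only places the document in the memory buffer.
        self.buffer.append(doc)

    def refresh(self):
        # Move buffered documents into a new segment, making them searchable.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def search(self, term):
        # Only documents that already live in a segment can match.
        return [doc for seg in self.segments for doc in seg if term in doc]


idx = ToyIndex()
idx.index("refresh interval tuning")
before = idx.search("refresh")  # document is still in the buffer
idx.refresh()
after = idx.search("refresh")   # document is now in a segment
print(before, after)
```

In this model, the search issued before the refresh finds nothing, and the same search after the refresh finds the document.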

## Why adjust the default index refresh interval
Refresh operations are resource intensive. The procedure of transferring data into new segments and rendering them searchable demands CPU, memory, and input/output (I/O) resources. Consequently, fewer refresh operations can conserve these resources for other tasks.


However, less frequent refreshes also mean a longer wait for newly indexed documents to become searchable. If your use case requires near real-time search of new data, infrequent refreshes may not be appropriate. On the other hand, if your workload can tolerate a delay between indexing data and searching it, reducing the refresh frequency frees up resources, which can increase indexing throughput.
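
As a rough illustration of this trade-off, the following Python sketch counts the maximum number of automatic refreshes in a one-hour window for a few intervals. The helper name `refreshes_in_window` is hypothetical, and the count is only an upper bound, because it assumes the index stays active for the whole window.

```python
import math


def refreshes_in_window(window_seconds, interval_seconds):
    """Upper bound on automatic refreshes during a window of steady activity."""
    return math.floor(window_seconds / interval_seconds)


for interval in (1, 30, 60):
    count = refreshes_in_window(3600, interval)
    print(
        f"interval={interval}s: up to {count} refreshes per hour, "
        f"new documents may wait up to about {interval}s to become searchable"
    )
```

The arithmetic shows both sides of the trade-off at once: moving from a 1-second to a 60-second interval cuts the refresh count by a factor of 60, and the worst-case wait before a document is searchable grows by the same factor.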

## View the refresh interval
The frequency of refresh operations is dictated by the refresh interval set for an OpenSearch index. By default, the refresh interval for an index is set to 1 second. This implies that a refresh operation will be executed every second, provided the index is active. An index is considered active if it has received one or more search requests within the last 30 seconds.

Assuming we are using an index named ```sample_data```, we can check the refresh interval for this index by running the following API command:


```GET /sample_data/_settings/index.refresh_interval```

In this example, the refresh interval of the ```sample_data``` index is 1 second.


<img src="/assets/media/blog-images/2023-10-28-optimize-refresh-interval/get_refresh_0.png" alt="get refresh"/>{: .img-fluid }

Note that if a refresh interval is not manually set, the API call may not return any results. The default refresh interval is 1 second, but this property is not automatically added to the ```_settings``` API response unless it is manually set or adjusted.
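
Client code that reads this setting therefore needs a fallback for the case where nothing is returned. The sketch below is one way to handle that in Python. The function name `effective_refresh_interval` is hypothetical, and the nested response shape (`index name -> settings -> index -> refresh_interval`) is an assumption about the ```_settings``` response, so verify it against your own cluster.

```python
DEFAULT_REFRESH_INTERVAL = "1s"  # default value stated in this post


def effective_refresh_interval(settings_response, index_name):
    """Return the explicit refresh interval, or the default if none is set.

    Assumes the response is a parsed JSON dict shaped like
    {index_name: {"settings": {"index": {"refresh_interval": "..."}}}}.
    """
    index_entry = settings_response.get(index_name, {})
    index_settings = index_entry.get("settings", {}).get("index", {})
    return index_settings.get("refresh_interval", DEFAULT_REFRESH_INTERVAL)


# Assumed response when the interval was set explicitly.
explicit = {"sample_data": {"settings": {"index": {"refresh_interval": "60s"}}}}
# Assumed response when the interval was never set (nothing returned).
unset = {}

print(effective_refresh_interval(explicit, "sample_data"))
print(effective_refresh_interval(unset, "sample_data"))
```

Using chained `.get` calls keeps the lookup safe whether the response is completely empty or contains the index name with empty settings.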


## Change the refresh interval
You can adjust the refresh interval for an index by using the ```_settings``` API. In the following example, the refresh interval of the ```sample_data``` index is set to 60 seconds:


```
PUT /sample_data/_settings
{
  "index" : {
    "refresh_interval" : "60s"
  }
}
```

<img src="/assets/media/blog-images/2023-10-28-optimize-refresh-interval/change_refresh_1.png" alt="change refresh"/>{: .img-fluid }
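
If you prefer to issue the same call from a script, the following Python sketch builds the equivalent HTTP request using only the standard library. The base URL `http://localhost:9200` and the helper name `build_refresh_interval_request` are assumptions for illustration. The sketch builds the request without sending it and omits authentication and TLS settings, which most clusters require.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index_name, interval):
    """Build (but do not send) a PUT request that sets the refresh interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local endpoint; replace with your own cluster address.
req = build_refresh_interval_request("http://localhost:9200", "sample_data", "60s")
print(req.get_method(), req.full_url, req.data.decode("utf-8"))

# To send the request against a reachable cluster:
# urllib.request.urlopen(req)
```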

It is also possible to disable automatic refreshes. Setting ```"refresh_interval" : "-1"``` disables automatic refreshing. In this scenario, the index must be refreshed manually using the ```_refresh``` API.


The following example API call manually triggers a refresh on the index ```sample_data```:


```POST sample_data/_refresh```

<img src="/assets/media/blog-images/2023-10-28-optimize-refresh-interval/manual_refresh_2.png" alt="manual refresh"/>{: .img-fluid }

You have the option to disable automatic refreshes prior to initiating a known write-intensive workload and then manually trigger a refresh upon its completion. For instance, if you're uploading new data to OpenSearch daily through a batch process, it might be beneficial to disable automatic refreshes just before the batch process begins. After the process concludes, you can manually initiate a refresh.
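
The batch pattern described above can be sketched as a Python context manager. This is a minimal sketch under stated assumptions: `paused_refresh` is a hypothetical helper, `send` stands in for whatever function your client uses to issue requests, and the interval is restored to an explicit value (`1s` here) rather than to whatever was configured before. The example records the calls instead of contacting a cluster.

```python
from contextlib import contextmanager


@contextmanager
def paused_refresh(send, index_name, restore_to="1s"):
    """Disable automatic refreshes for the duration of a batch load."""
    settings_path = f"/{index_name}/_settings"
    send("PUT", settings_path, {"index": {"refresh_interval": "-1"}})
    try:
        yield
    finally:
        # Restore the interval even if the batch fails, then refresh once
        # so the newly written documents become searchable.
        send("PUT", settings_path, {"index": {"refresh_interval": restore_to}})
        send("POST", f"/{index_name}/_refresh", None)


calls = []


def record(method, path, body):
    # Stand-in for a real HTTP call; it only records what would be sent.
    calls.append((method, path, body))


with paused_refresh(record, "sample_data"):
    record("POST", "/sample_data/_bulk", "...batch payload...")

for call in calls:
    print(call)
```

Placing the restore and the manual refresh in a `finally` block means a failed batch does not leave the index with automatic refreshes disabled.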


## Conclusion and other resources
Modifying the default refresh interval to strike a balance between the speed at which new documents become searchable and the CPU and I/O costs of the refresh operation can enhance OpenSearch performance. While a shorter refresh interval, which implies more frequent refreshes, allows documents to become searchable more rapidly post-indexing, it does so at the expense of increased resource utilization.

If you prefer to learn about this topic in the format of a video instead of a blog post, check out the YouTube video [OpenSearch - How to change the refresh interval of an index](https://www.youtube.com/watch?v=8uyemEfgcY8). This blog post is based on the GitHub repository [OpenSearch_Refresh_Interval](https://github.com/ev2900/OpenSearch_Refresh_Interval).

Binary file added assets/media/authors/awssamit.jpg