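Modify instance configuration
Last updated: 2026-04-21
Overview
Milvus engine instances let you view and modify instance configuration from the console. This article describes how to update the configuration of a Milvus engine instance in the console to meet different business needs.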
Prerequisites
- A Milvus engine instance has been created.
- The instance is in the Running state.
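Procedure
Follow these steps to update the configuration of a Milvus engine instance in the console:
- Log in to the cloud management console and choose "Products & Services > Database > Vector Database VectorDB".
- Select the region where the cloud server is located.
- In the instance list, find the target Milvus engine instance and click its name to open the instance details page.
- In the left navigation pane, select the "Instance Configuration" tab.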
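- In the "Instance Configuration" input box, enter the parameters that should override the default configuration, then click "Save Configuration".
  - Parameter format: configuration parameters must be written in YAML.
- In the "Prompt" dialog box that appears, enter the reason for the change and click "OK".
Note: After the configuration change is submitted, if any modified item requires a restart to take effect, the instance is restarted once the change has been applied. During this time the instance temporarily enters the Upgrading state, and it automatically returns to the Running state after the configuration update completes.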
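The input box in the procedure above takes YAML that overrides the default configuration, which suggests that only the keys being changed need to be entered. As a minimal sketch under that assumption, the following Python renders such a fragment from a nested dict. The helper name `to_yaml` and the override values (2048, 64) are hypothetical and chosen only for illustration; the key names come from the configuration listed in this article.

```python
def to_yaml(mapping, indent=0):
    """Render a nested dict of simple scalars as block-style YAML.

    Only mappings, numbers, booleans, and plain strings are handled;
    strings that need quoting are out of scope for this sketch.
    """
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2).splitlines())
        elif isinstance(value, bool):
            # YAML booleans are lowercase, unlike Python's True/False.
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


# Hypothetical override: only the keys that should differ from the defaults.
overrides = {
    "proxy": {"maxTaskNum": 2048},
    "quotaAndLimits": {"dml": {"enabled": True, "insertRate": {"max": 64}}},
}

override_text = to_yaml(overrides)
print(override_text)
```

The printed text can then be reviewed and pasted into the input box. Generating the fragment from a dict keeps the indentation consistent, which matters because YAML nesting is expressed purely through indentation.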
YAML
# Related configuration of rootCoord, used to handle data definition language (DDL) and data control language (DCL) requests
rootCoord:
  maxDatabaseNum: 64 # Maximum number of databases
  maxPartitionNum: 4096 # Maximum number of partitions in a collection
  minSegmentSizeToEnableIndex: 1024 # It's a threshold. When the segment size is less than this value, the segment will not be indexed
  importTaskExpiration: 900 # (in seconds) Duration after which an import task will expire (be killed). Default 900 seconds (15 minutes).
  importTaskRetention: 86400 # (in seconds) Milvus will keep the record of import tasks for at least `importTaskRetention` seconds. Default 86400 seconds (24 hours).
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

# Related configuration of proxy, used to validate client requests and reduce the returned results.
proxy:
  timeTickInterval: 200 # ms, the interval at which proxy synchronizes the time tick
  healthCheckTimeout: 3000 # ms, the interval at which to do component health check
  maxNameLength: 255 # Maximum length of name for a collection or alias
  # Maximum number of fields in a collection.
  # As of today (2.2.0 and after) it is strongly DISCOURAGED to set maxFieldNum >= 64.
  # So adjust at your own risk!
  maxFieldNum: 64
  maxTaskNum: 1024 # max task number of proxy task queue
  grpc:
    serverMaxSendSize: 268435456
    serverMaxRecvSize: 67108864
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 67108864

# Related configuration of queryCoord, used to manage topology and load balancing for the query nodes, and handoff from growing segments to sealed segments.
queryCoord:
  autoHandoff: true # Enable auto handoff
  autoBalance: true # Enable auto balance
  balancer: ScoreBasedBalancer # Balancer to use
  overloadedMemoryThresholdPercentage: 90 # The threshold percentage for memory overload
  balanceIntervalSeconds: 60
  memoryUsageMaxDifferencePercentage: 30
  checkInterval: 1000
  channelTaskTimeout: 60000 # 1 minute
  segmentTaskTimeout: 120000 # 2 minutes
  distPullInterval: 500
  heartbeatAvailableInterval: 10000 # 10s, Only QueryNodes which fetched heartbeats within the duration are available
  loadTimeoutSeconds: 600
  checkHandoffInterval: 5000
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

# Related configuration of queryNode, used to run hybrid search between vector and scalar data.
queryNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
  stats:
    publishInterval: 1000 # Interval for querynode to report node information (milliseconds)
  segcore:
    cgoPoolSizeRatio: 2.0 # cgo pool size ratio to max read concurrency
    knowhereThreadPoolNumRatio: 4
    # Use more threads to make better use of SSD throughput in disk index.
    # This parameter is only useful when enable-disk = true.
    # And this value should be a number greater than 1 and less than 32.
    chunkRows: 128 # The number of vectors in a chunk.
    exprEvalBatchSize: 8192 # The batch size for executor get next
    interimIndex: # build a temporary vector index for growing segment or binlog to accelerate search
      enableIndex: true
      nlist: 128 # segment index nlist
      nprobe: 16 # nprobe to search segment, based on your accuracy requirement, must be smaller than nlist
      memExpansionRate: 1.15 # the ratio of building interim index memory usage to raw data
  loadMemoryUsageFactor: 1 # The multiply factor of calculating the memory usage while loading segments
  enableDisk: false # enable querynode load disk index, and search on disk index
  maxDiskUsagePercentage: 95
  grouping:
    enabled: true
    maxNQ: 1000
    topKMergeRatio: 20
  scheduler:
    receiveChanSize: 10240
    unsolvedQueueSize: 10240
    # maxReadConcurrentRatio is the concurrency ratio of read task (search task and query task).
    # Max read concurrency would be the value of runtime.NumCPU * maxReadConcurrentRatio.
    # It defaults to 2.0, which means max read concurrency would be the value of runtime.NumCPU * 2.
    # Max read concurrency must be greater than or equal to 1, and less than or equal to runtime.NumCPU * 100.
    # (0, 100]
    maxReadConcurrentRatio: 1
    cpuRatio: 10 # ratio used to estimate read task cpu usage.
    maxTimestampLag: 86400
    # read task schedule policy: fifo(by default), user-task-polling.
    scheduleReadPolicy:
      # fifo: A FIFO queue supports the scheduling.
      # user-task-polling:
      # The user's tasks will be polled one by one and scheduled.
      # Scheduling is fair on task granularity.
      # The policy is based on the username for authentication.
      # And an empty username is considered the same user.
      # When there are no multi-users, the policy decays into FIFO
      name: fifo
      maxPendingTask: 10240
      # user-task-polling configure:
      taskQueueExpire: 60 # 1 min by default, expire time of inner user task queue since queue is empty.
      enableCrossUserGrouping: false # false by default. Enable cross-user grouping when using user-task-polling policy (disable it if the tasks of any user cannot be merged with others).
      maxPendingTaskPerUser: 1024 # 50 by default, max pending task in scheduler per user.
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

indexCoord:
  bindIndexNodeMode:
    enable: false
    withCred: false
  segment:
    minSegmentNumRowsToEnableIndex: 1024 # It's a threshold. When the segment num rows is less than this value, the segment will not be indexed

indexNode:
  scheduler:
    buildParallel: 1
  enableDisk: true # enable index node build disk vector index
  maxDiskUsagePercentage: 95
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

dataCoord:
  channel:
    watchTimeoutInterval: 300 # Timeout on watching channels (in seconds). Datanode tickler update watch progress will reset timeout timer.
    balanceSilentDuration: 300 # The duration before the channelBalancer on datacoord to run
    balanceInterval: 360 # The interval for the channelBalancer on datacoord to check balance status
  segment:
    maxSize: 1024 # Maximum size of a segment in MB
    diskSegmentMaxSize: 2048 # Maximum size of a segment in MB for collection which has Disk index
    sealProportion: 0.12
    # The time of the assignment expiration in ms
    # Warning! this parameter is an expert variable and closely related to data integrity. Without specific
    # target and solid understanding of the scenarios, it should not be changed. If it's necessary to alter
    # this parameter, make sure that the newly changed value is larger than the previous value used before restart
    # otherwise there could be a large possibility of data loss
    assignmentExpiration: 2000
    maxLife: 86400 # The max lifetime of segment in seconds, 24*60*60
    # If a segment didn't accept dml records in maxIdleTime and the size of segment is greater than
    # minSizeFromIdleToSealed, Milvus will automatically seal it.
    # The max idle time of segment in seconds, 10*60.
    maxIdleTime: 600
    minSizeFromIdleToSealed: 16 # The min size in MB of an idle segment that can be sealed.
    # The max number of binlog file for one segment, the segment will be sealed if
    # the number of binlog file reaches to max value.
    maxBinlogFileNumber: 32
    smallProportion: 0.5 # The segment is considered as "small segment" when its # of rows is smaller than
    # (smallProportion * segment max # of rows).
    # A compaction will happen on small segments if the segment after compaction will have
    compactableProportion: 0.85
    # over (compactableProportion * segment max # of rows) rows.
    # MUST BE GREATER THAN OR EQUAL TO <smallProportion>!!!
    # During compaction, the size of segment # of rows is able to exceed segment max # of rows by (expansionRate-1) * 100%.
    expansionRate: 1.25
    # Whether to enable levelzero segment
    enableLevelZero: false
  enableCompaction: true # Enable data segment compaction
  compaction:
    enableAutoCompaction: true
    rpcTimeout: 10 # compaction rpc request timeout in seconds
    maxParallelTaskNum: 10 # max parallel compaction task number
    indexBasedCompaction: true

    levelzero:
      forceTrigger:
        minSize: 8 # The minimum size in MB to force trigger a LevelZero Compaction
        deltalogMinNum: 10 # the minimum number of deltalog files to force trigger a LevelZero Compaction

  enableGarbageCollection: true
  gc:
    interval: 3600 # gc interval in seconds
    missingTolerance: 3600 # file meta missing tolerance duration in seconds, 3600
    dropTolerance: 10800 # file belongs to dropped entity tolerance duration in seconds. 10800
  enableActiveStandby: false
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

dataNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
    maxParallelSyncMgrTasks: 256 # The max concurrent sync task number of datanode sync mgr globally
    skipMode:
      # when there are only timetick msg in flowgraph for a while (longer than coldTime),
      # flowGraph will turn on skip mode to skip most timeticks to reduce cost, especially there are a lot of channels
      enable: true
      skipNum: 4
      coldTime: 60
  segment:
    insertBufSize: 16777216 # Max buffer size to flush for a single segment.
    deleteBufBytes: 67108864 # Max buffer size to flush del for a single channel
    syncPeriod: 600 # The period to sync segments if buffer is not empty.
  # can specify ip for example
  # ip: 127.0.0.1
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912
  memory:
    forceSyncEnable: true # `true` to force sync if memory usage is too high
    forceSyncSegmentNum: 1 # number of segments to sync, segments with top largest buffer will be synced.
    watermarkStandalone: 0.2 # memory watermark for standalone, upon reaching this watermark, segments will be synced.
    watermarkCluster: 0.5 # memory watermark for cluster, upon reaching this watermark, segments will be synced.
  timetick:
    byRPC: true
  channel:
    # specify the size of global work pool of all channels
    # if this parameter <= 0, will set it as the maximum number of CPUs that can be executing
    # suggest to set it bigger on large collection numbers to avoid blocking
    workPoolSize: -1
    # specify the size of global work pool for channel checkpoint updating
    # if this parameter <= 0, will set it as 1000
    # suggest to set it bigger on large collection numbers to avoid blocking
    updateChannelCheckpointMaxParallel: 1000

grpc:
  client:
    compressionEnabled: false
    dialTimeout: 200
    keepAliveTime: 10000
    keepAliveTimeout: 20000
    maxMaxAttempts: 10
    initialBackOff: 0.2 # seconds
    maxBackoff: 10 # seconds

quotaAndLimits:
  enabled: true # `true` to enable quota and limits, `false` to disable.
  limits:
    maxCollectionNum: 65536
    maxCollectionNumPerDB: 65536
  # quotaCenterCollectInterval is the time interval that quotaCenter
  # collects metrics from Proxies, Query cluster and Data cluster.
  # seconds, (0 ~ 65536)
  quotaCenterCollectInterval: 3
  ddl:
    enabled: false
    collectionRate: -1 # qps, default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
    partitionRate: -1 # qps, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
  indexRate:
    enabled: false
    max: -1 # qps, default no limit, rate for CreateIndex, DropIndex
  flushRate:
    enabled: false
    max: -1 # qps, default no limit, rate for flush
  compactionRate:
    enabled: false
    max: -1 # qps, default no limit, rate for manualCompaction
  dml:
    # dml limit rates, default no limit.
    # The maximum rate will not be greater than max.
    enabled: false
    insertRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    upsertRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    deleteRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    bulkLoadRate:
      collection:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad rate
      max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad rate
  dql:
    # dql limit rates, default no limit.
    # The maximum rate will not be greater than max.
    enabled: false
    searchRate:
      collection:
        max: -1 # vps (vectors per second), default no limit
      max: -1 # vps (vectors per second), default no limit
    queryRate:
      collection:
        max: -1 # qps, default no limit
      max: -1 # qps, default no limit
  limitWriting:
    # forceDeny false means dml requests are allowed (except for some
    # specific conditions, such as node memory reaching the watermark), true means always reject all dml requests.
    forceDeny: false
    ttProtection:
      enabled: false
      # maxTimeTickDelay indicates the backpressure for DML Operations.
      # DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
      # if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
      # seconds
      maxTimeTickDelay: 300
    memProtection:
      # When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
      # When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
      # When memory usage < memoryLowWaterLevel, no action.
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
      dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
      queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
      queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
    growingSegmentsSizeProtection:
      # No action will be taken if the growing segments size is less than the low watermark.
      # When the growing segments size exceeds the low watermark, the dml rate will be reduced,
      # but the rate will not be lower than `minRateRatio * dmlRate`.
      enabled: false
      minRateRatio: 0.5
      lowWaterLevel: 0.2
      highWaterLevel: 0.4
    diskProtection:
      enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
      diskQuota: -1 # MB, (0, +inf), default no limit
      diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
  limitReading:
    # forceDeny false means dql requests are allowed (except for some
    # specific conditions, such as collection has been dropped), true means always reject all dql requests.
    forceDeny: false
    queueProtection:
      enabled: false
      # nqInQueueThreshold indicated that the system was under backpressure for Search/Query path.
      # If NQ in any QueryNode's queue is greater than nqInQueueThreshold, search&query rates would gradually cool off
      # until the NQ in queue no longer exceeds nqInQueueThreshold. We think of the NQ of query request as 1.
      # int, default no limit
      nqInQueueThreshold: -1
      # queueLatencyThreshold indicated that the system was under backpressure for Search/Query path.
      # If dql latency of queuing is greater than queueLatencyThreshold, search&query rates would gradually cool off
      # until the latency of queuing no longer exceeds queueLatencyThreshold.
      # The latency here refers to the averaged latency over a period of time.
      # milliseconds, default no limit
      queueLatencyThreshold: -1
    resultProtection:
      enabled: false
      # maxReadResultRate indicated that the system was under backpressure for Search/Query path.
      # If dql result rate is greater than maxReadResultRate, search&query rates would gradually cool off
      # until the read result rate no longer exceeds maxReadResultRate.
      # MB/s, default no limit
      maxReadResultRate: -1
    # coolOffSpeed is the speed of search&query rates cool off.
    # (0, 1]
    coolOffSpeed: 0.9
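Several comments in the configuration above state constraints between values, for example that `nprobe` must be smaller than `nlist` and that `compactableProportion` must be greater than or equal to `smallProportion`. The following Python is a rough sketch of checking a few of those rules before saving a change. The function name `check_settings` is hypothetical, the keys are flattened to their leaf names purely for brevity, and the rule that the low memory water level must be below the high one is inferred from the comments rather than stated outright.

```python
# Defaults copied from the configuration in this article. Keys are flattened
# to leaf names for brevity; this is an illustration, not the console format.
DEFAULTS = {
    "nlist": 128,
    "nprobe": 16,
    "maxReadConcurrentRatio": 1,
    "smallProportion": 0.5,
    "compactableProportion": 0.85,
    "dataNodeMemoryLowWaterLevel": 0.85,
    "dataNodeMemoryHighWaterLevel": 0.95,
    "coolOffSpeed": 0.9,
}


def check_settings(overrides):
    """Apply overrides to DEFAULTS and return a list of violated rules."""
    cfg = {**DEFAULTS, **overrides}
    problems = []
    if not cfg["nprobe"] < cfg["nlist"]:
        problems.append("nprobe must be smaller than nlist")
    if not 0 < cfg["maxReadConcurrentRatio"] <= 100:
        problems.append("maxReadConcurrentRatio must be in (0, 100]")
    if cfg["compactableProportion"] < cfg["smallProportion"]:
        problems.append("compactableProportion must be >= smallProportion")
    low = cfg["dataNodeMemoryLowWaterLevel"]
    high = cfg["dataNodeMemoryHighWaterLevel"]
    if not (0 < low <= 1 and 0 < high <= 1):
        problems.append("dataNode memory water levels must be in (0, 1]")
    elif low >= high:
        # Inferred from the memProtection comments, not stated explicitly.
        problems.append("dataNode low water level must be below the high one")
    if not 0 < cfg["coolOffSpeed"] <= 1:
        problems.append("coolOffSpeed must be in (0, 1]")
    return problems


print(check_settings({"nprobe": 256}))
```

An empty list means none of the checked rules is violated. The sketch covers only the constraints that the comments spell out, so passing it does not guarantee that the instance will accept the configuration.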
