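Modify Instance Configuration

Last updated: 2026-04-21
Overview

Milvus engine instances let you view and modify instance configuration through the console. This topic describes how to update the configuration of a Milvus engine instance in the console to meet different business requirements.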
Prerequisites

A Milvus engine instance has been created.
The instance is in the Running state.
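Procedure

Follow these steps to update the configuration of a Milvus engine instance in the console:
1. Log in to the cloud management console and choose "Products and Services > Database > Vector Database VectorDB".
2. Select the region where the cloud server is located.
3. In the instance list, find the target Milvus engine instance and click its name to open the instance details page.
4. In the left navigation pane, select the "Instance Configuration" tab.
5. In the "Instance Configuration" input box, enter the parameters that should override the default configuration, then click "Save Configuration".
   - Parameter format: configuration parameters must follow the YAML format.
6. In the "Prompt" dialog box that appears, enter the reason for the change and click "OK".
Note: After the configuration change request is submitted, if any modified item requires a restart to take effect, the instance is restarted once the change completes. During this time the instance temporarily enters the Upgrading state. After the configuration update finishes, the cluster automatically returns to the Running state.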
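As a rough sketch of what could be entered in the configuration input box, the fragment below overrides two parameters and leaves everything else untouched. It assumes that only the keys being changed need to be listed, each nested under its parent component. The values are illustrative assumptions, not recommendations.

```yaml
# Hypothetical override: only the parameters to change are listed,
# nested under the same parent keys as in the configuration listing on this page.
proxy:
  maxTaskNum: 2048          # assumed value; the listing on this page shows 1024
queryNode:
  segcore:
    interimIndex:
      nprobe: 32            # assumed value; must stay smaller than nlist (128 in the listing)
```

Indentation is significant in YAML, so each nested key should be indented consistently (two spaces per level in this sketch).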
YAML
# Related configuration of rootCoord, used to handle data definition language (DDL) and data control language (DCL) requests
rootCoord:
  maxDatabaseNum: 64 # Maximum number of databases
  maxPartitionNum: 4096 # Maximum number of partitions in a collection
  minSegmentSizeToEnableIndex: 1024 # It's a threshold. When the segment size is less than this value, the segment will not be indexed
  importTaskExpiration: 900 # (in seconds) Duration after which an import task will expire (be killed). Default 900 seconds (15 minutes).
  importTaskRetention: 86400 # (in seconds) Milvus will keep the record of import tasks for at least `importTaskRetention` seconds. Default 86400 seconds (24 hours).
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

# Related configuration of proxy, used to validate client requests and reduce the returned results.
proxy:
  timeTickInterval: 200 # ms, the interval at which the proxy synchronizes the time tick
  healthCheckTimeout: 3000 # ms, the interval at which component health checks are performed
  maxNameLength: 255 # Maximum length of name for a collection or alias
  # Maximum number of fields in a collection.
  # As of today (2.2.0 and after) it is strongly DISCOURAGED to set maxFieldNum >= 64.
  # So adjust at your risk!
  maxFieldNum: 64
  maxTaskNum: 1024 # max task number of proxy task queue
  grpc:
    serverMaxSendSize: 268435456
    serverMaxRecvSize: 67108864
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 67108864

# Related configuration of queryCoord, used to manage topology and load balancing for the query nodes, and handoff from growing segments to sealed segments.
queryCoord:
  autoHandoff: true # Enable auto handoff
  autoBalance: true # Enable auto balance
  balancer: ScoreBasedBalancer # Balancer to use
  overloadedMemoryThresholdPercentage: 90 # Memory usage percentage at which a node is considered overloaded
  balanceIntervalSeconds: 60
  memoryUsageMaxDifferencePercentage: 30
  checkInterval: 1000
  channelTaskTimeout: 60000 # 1 minute
  segmentTaskTimeout: 120000 # 2 minutes
  distPullInterval: 500
  heartbeatAvailableInterval: 10000 # 10s, Only QueryNodes which fetched heartbeats within the duration are available
  loadTimeoutSeconds: 600
  checkHandoffInterval: 5000
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

# Related configuration of queryNode, used to run hybrid search between vector and scalar data.
queryNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
  stats:
    publishInterval: 1000 # Interval for querynode to report node information (milliseconds)
  segcore:
    cgoPoolSizeRatio: 2.0 # cgo pool size ratio to max read concurrency
    # Use more threads to make better use of SSD throughput in disk index.
    # This parameter is only useful when enable-disk = true.
    # And this value should be a number greater than 1 and less than 32.
    knowhereThreadPoolNumRatio: 4
    chunkRows: 128 # The number of vectors in a chunk.
    exprEvalBatchSize: 8192 # The batch size for executor get next
    interimIndex: # build a temporary vector index for growing segment or binlog to accelerate search
      enableIndex: true
      nlist: 128 # segment index nlist
      nprobe: 16 # nprobe to search segment, based on your accuracy requirement, must be smaller than nlist
      memExpansionRate: 1.15 # the ratio of building interim index memory usage to raw data
  loadMemoryUsageFactor: 1 # The multiply factor of calculating the memory usage while loading segments
  enableDisk: false # enable querynode load disk index, and search on disk index
  maxDiskUsagePercentage: 95
  grouping:
    enabled: true
    maxNQ: 1000
    topKMergeRatio: 20
  scheduler:
    receiveChanSize: 10240
    unsolvedQueueSize: 10240
    # maxReadConcurrentRatio is the concurrency ratio of read task (search task and query task).
    # Max read concurrency would be the value of runtime.NumCPU * maxReadConcurrentRatio.
    # It defaults to 2.0, which means max read concurrency would be the value of runtime.NumCPU * 2.
    # Max read concurrency must be greater than or equal to 1, and less than or equal to runtime.NumCPU * 100.
    # (0, 100]
    maxReadConcurrentRatio: 1
    cpuRatio: 10 # ratio used to estimate read task cpu usage.
    maxTimestampLag: 86400
    # read task schedule policy: fifo(by default), user-task-polling.
    scheduleReadPolicy:
      # fifo: A FIFO queue support the schedule.
      # user-task-polling:
      #     The user's tasks will be polled one by one and scheduled.
      #     Scheduling is fair on task granularity.
      #     The policy is based on the username for authentication.
      #     And an empty username is considered the same user.
      #     When there are no multi-users, the policy decays into FIFO
      name: fifo
      maxPendingTask: 10240
      # user-task-polling configure:
      taskQueueExpire: 60 # 1 min by default, expire time of inner user task queue since queue is empty.
      enableCrossUserGrouping: false # false by default Enable Cross user grouping when using user-task-polling policy. (close it if task of any user can not merge others).
      maxPendingTaskPerUser: 1024 # 50 by default, max pending task in scheduler per user.
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

indexCoord:
  bindIndexNodeMode:
    enable: false
    withCred: false
  segment:
    minSegmentNumRowsToEnableIndex: 1024 # It's a threshold. When the segment num rows is less than this value, the segment will not be indexed

indexNode:
  scheduler:
    buildParallel: 1
  enableDisk: true # enable index node build disk vector index
  maxDiskUsagePercentage: 95
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

dataCoord:
  channel:
    watchTimeoutInterval: 300 # Timeout on watching channels (in seconds). Datanode tickler update watch progress will reset timeout timer.
    balanceSilentDuration: 300 # The duration before the channelBalancer on datacoord to run
    balanceInterval: 360 # The interval for the channelBalancer on datacoord to check balance status
  segment:
    maxSize: 1024 # Maximum size of a segment in MB
    diskSegmentMaxSize: 2048 # Maximum size of a segment in MB for collection which has Disk index
    sealProportion: 0.12
    # The time of the assignment expiration in ms
    # Warning! this parameter is an expert variable and closely related to data integrity. Without specific
    # target and solid understanding of the scenarios, it should not be changed. If it's necessary to alter
    # this parameter, make sure that the newly changed value is larger than the previous value used before restart
    # otherwise there could be a large possibility of data loss
    assignmentExpiration: 2000
    maxLife: 86400 # The max lifetime of segment in seconds, 24*60*60
    # If a segment didn't accept dml records in maxIdleTime and the size of segment is greater than
    # minSizeFromIdleToSealed, Milvus will automatically seal it.
    # The max idle time of segment in seconds, 10*60.
    maxIdleTime: 600
    minSizeFromIdleToSealed: 16 # The minimum size in MB of a segment that can be sealed when idle.
    # The max number of binlog file for one segment, the segment will be sealed if
    # the number of binlog file reaches to max value.
    maxBinlogFileNumber: 32
    # A segment is considered a "small segment" when its number of rows is smaller than
    # (smallProportion * segment max number of rows).
    smallProportion: 0.5
    # A compaction will happen on small segments if the segment after compaction will have
    # over (compactableProportion * segment max number of rows) rows.
    # MUST BE GREATER THAN OR EQUAL TO <smallProportion>!!!
    compactableProportion: 0.85
    # During compaction, the number of rows in a segment is able to exceed the segment max number of rows by (expansionRate-1) * 100%.
    expansionRate: 1.25
    # Whether to enable levelzero segment
    enableLevelZero: false
  enableCompaction: true # Enable data segment compaction
  compaction:
    enableAutoCompaction: true
    rpcTimeout: 10 # compaction rpc request timeout in seconds
    maxParallelTaskNum: 10 # max parallel compaction task number
    indexBasedCompaction: true

    levelzero:
      forceTrigger:
        minSize: 8 # The minimum size in MB to force trigger a LevelZero Compaction
        deltalogMinNum: 10 # the minimum number of deltalog files to force trigger a LevelZero Compaction

  enableGarbageCollection: true
  gc:
    interval: 3600 # gc interval in seconds
    missingTolerance: 3600 # file meta missing tolerance duration in seconds, 3600
    dropTolerance: 10800 # file belongs to dropped entity tolerance duration in seconds. 10800
  enableActiveStandby: false
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912

dataNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
    maxParallelSyncMgrTasks: 256 # The max concurrent sync task number of datanode sync mgr globally
    skipMode:
      # when there are only timetick msg in flowgraph for a while (longer than coldTime),
      # flowGraph will turn on skip mode to skip most timeticks to reduce cost, especially there are a lot of channels
      enable: true
      skipNum: 4
      coldTime: 60
  segment:
    insertBufSize: 16777216 # Max buffer size to flush for a single segment.
    deleteBufBytes: 67108864 # Max buffer size to flush del for a single channel
    syncPeriod: 600 # The period to sync segments if buffer is not empty.
  # can specify ip for example
  # ip: 127.0.0.1
  grpc:
    serverMaxSendSize: 536870912
    serverMaxRecvSize: 268435456
    clientMaxSendSize: 268435456
    clientMaxRecvSize: 536870912
  memory:
    forceSyncEnable: true # `true` to force sync if memory usage is too high
    forceSyncSegmentNum: 1 # number of segments to sync, segments with top largest buffer will be synced.
    watermarkStandalone: 0.2 # memory watermark for standalone, upon reaching this watermark, segments will be synced.
    watermarkCluster: 0.5 # memory watermark for cluster, upon reaching this watermark, segments will be synced.
  timetick:
    byRPC: true
  channel:
    # specify the size of global work pool of all channels
    # if this parameter <= 0, will set it as the maximum number of CPUs that can be executing
    # suggest to set it bigger on large collection numbers to avoid blocking
    workPoolSize: -1
    # specify the size of global work pool for channel checkpoint updating
    # if this parameter <= 0, will set it as 1000
    # suggest to set it bigger on large collection numbers to avoid blocking
    updateChannelCheckpointMaxParallel: 1000

grpc:
  client:
    compressionEnabled: false
    dialTimeout: 200
    keepAliveTime: 10000
    keepAliveTimeout: 20000
    maxMaxAttempts: 10
    initialBackOff: 0.2 # seconds
    maxBackoff: 10 # seconds

quotaAndLimits:
  enabled: true # `true` to enable quota and limits, `false` to disable.
  limits:
    maxCollectionNum: 65536
    maxCollectionNumPerDB: 65536
  # quotaCenterCollectInterval is the time interval that quotaCenter
  # collects metrics from Proxies, Query cluster and Data cluster.
  # seconds, (0 ~ 65536)
  quotaCenterCollectInterval: 3
  ddl:
    enabled: false
    collectionRate: -1 # qps, default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
    partitionRate: -1 # qps, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
  indexRate:
    enabled: false
    max: -1 # qps, default no limit, rate for CreateIndex, DropIndex
  flushRate:
    enabled: false
    max: -1 # qps, default no limit, rate for flush
  compactionRate:
    enabled: false
    max: -1 # qps, default no limit, rate for manualCompaction
  dml:
    # dml limit rates, default no limit.
    # The maximum rate will not be greater than max.
    enabled: false
    insertRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    upsertRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    deleteRate:
      collection:
        max: -1 # MB/s, default no limit
      max: -1 # MB/s, default no limit
    bulkLoadRate:
      collection:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad rate
      max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad rate
  dql:
    # dql limit rates, default no limit.
    # The maximum rate will not be greater than max.
    enabled: false
    searchRate:
      collection:
        max: -1 # vps (vectors per second), default no limit
      max: -1 # vps (vectors per second), default no limit
    queryRate:
      collection:
        max: -1 # qps, default no limit
      max: -1 # qps, default no limit
  limitWriting:
    # forceDeny false means dml requests are allowed (except for some
    # specific conditions, such as memory of nodes to water marker), true means always reject all dml requests.
    forceDeny: false
    ttProtection:
      enabled: false
      # maxTimeTickDelay indicates the backpressure for DML Operations.
      # DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
      # if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
      # seconds
      maxTimeTickDelay: 300
    memProtection:
      # When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
      # When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
      # When memory usage < memoryLowWaterLevel, no action.
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
      dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
      queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
      queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
    growingSegmentsSizeProtection:
      # No action will be taken if the growing segments size is less than the low watermark.
      # When the growing segments size exceeds the low watermark, the dml rate will be reduced,
      # but the rate will not be lower than `minRateRatio * dmlRate`.
      enabled: false
      minRateRatio: 0.5
      lowWaterLevel: 0.2
      highWaterLevel: 0.4
    diskProtection:
      enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
      diskQuota: -1 # MB, (0, +inf), default no limit
      diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
  limitReading:
    # forceDeny false means dql requests are allowed (except for some
    # specific conditions, such as collection has been dropped), true means always reject all dql requests.
    forceDeny: false
    queueProtection:
      enabled: false
      # nqInQueueThreshold indicated that the system was under backpressure for Search/Query path.
      # If NQ in any QueryNode's queue is greater than nqInQueueThreshold, search&query rates would gradually cool off
      # until the NQ in queue no longer exceeds nqInQueueThreshold. We think of the NQ of query request as 1.
      # int, default no limit
      nqInQueueThreshold: -1
      # queueLatencyThreshold indicated that the system was under backpressure for Search/Query path.
      # If dql latency of queuing is greater than queueLatencyThreshold, search&query rates would gradually cool off
      # until the latency of queuing no longer exceeds queueLatencyThreshold.
      # The latency here refers to the averaged latency over a period of time.
      # milliseconds, default no limit
      queueLatencyThreshold: -1
    resultProtection:
      enabled: false
      # maxReadResultRate indicated that the system was under backpressure for Search/Query path.
      # If dql result rate is greater than maxReadResultRate, search&query rates would gradually cool off
      # until the read result rate no longer exceeds maxReadResultRate.
      # MB/s, default no limit
      maxReadResultRate: -1
    # coolOffSpeed is the speed of search&query rates cool off.
    # (0, 1]
    coolOffSpeed: 0.9
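
The rate-limit and memory-protection keys in the quotaAndLimits section above can be combined into a small override. The following is a minimal sketch, assuming the console accepts a partial configuration that lists only the changed keys. All numeric values are made up for illustration and would need to be sized for the actual workload.

```yaml
# Hypothetical override that throttles inserts and tightens DataNode memory protection.
quotaAndLimits:
  dml:
    enabled: true                         # turn on DML rate limiting (false in the listing above)
    insertRate:
      max: 64                             # MB/s, assumed instance-wide cap
      collection:
        max: 16                           # MB/s, assumed per-collection cap
  limitWriting:
    memProtection:
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.8    # assumed; DML rate is reduced above this level
      dataNodeMemoryHighWaterLevel: 0.9   # assumed; all DML requests are rejected above this level
```

Per the comments in the listing, the low water level must stay below the high water level, and both must fall in the range (0, 1].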