Google Cloud Native is in preview. Google Cloud Classic is fully supported.
Google Cloud Native v0.32.0 published on Wednesday, Nov 29, 2023 by Pulumi
google-native.aiplatform/v1.getIndexEndpoint
Gets an IndexEndpoint.
Using getIndexEndpoint
Two invocation forms are available. The direct form accepts plain arguments and either blocks until the result value is available, or returns a Promise-wrapped result. The output form accepts Input-wrapped arguments and returns an Output-wrapped result.
function getIndexEndpoint(args: GetIndexEndpointArgs, opts?: InvokeOptions): Promise<GetIndexEndpointResult>
function getIndexEndpointOutput(args: GetIndexEndpointOutputArgs, opts?: InvokeOptions): Output<GetIndexEndpointResult>
def get_index_endpoint(index_endpoint_id: Optional[str] = None,
                       location: Optional[str] = None,
                       project: Optional[str] = None,
                       opts: Optional[InvokeOptions] = None) -> GetIndexEndpointResult
def get_index_endpoint_output(index_endpoint_id: Optional[pulumi.Input[str]] = None,
                              location: Optional[pulumi.Input[str]] = None,
                              project: Optional[pulumi.Input[str]] = None,
                              opts: Optional[InvokeOptions] = None) -> Output[GetIndexEndpointResult]
func LookupIndexEndpoint(ctx *Context, args *LookupIndexEndpointArgs, opts ...InvokeOption) (*LookupIndexEndpointResult, error)
func LookupIndexEndpointOutput(ctx *Context, args *LookupIndexEndpointOutputArgs, opts ...InvokeOption) LookupIndexEndpointResultOutput
> Note: This function is named LookupIndexEndpoint in the Go SDK.
public static class GetIndexEndpoint
{
    public static Task<GetIndexEndpointResult> InvokeAsync(GetIndexEndpointArgs args, InvokeOptions? opts = null)
    public static Output<GetIndexEndpointResult> Invoke(GetIndexEndpointInvokeArgs args, InvokeOptions? opts = null)
}
public static CompletableFuture<GetIndexEndpointResult> getIndexEndpoint(GetIndexEndpointArgs args, InvokeOptions options)
// Output-based functions aren't available in Java yet
fn::invoke:
  function: google-native:aiplatform/v1:getIndexEndpoint
  arguments:
    # arguments dictionary
The following arguments are supported:
- IndexEndpointId string
- Location string
- Project string
- IndexEndpointId string
- Location string
- Project string
- indexEndpointId String
- location String
- project String
- indexEndpointId string
- location string
- project string
- index_endpoint_id str
- location str
- project str
- indexEndpointId String
- location String
- project String
getIndexEndpoint Result
The following output properties are available:
- CreateTime string - Timestamp when this IndexEndpoint was created.
- DeployedIndexes List<Pulumi.GoogleNative.Aiplatform.V1.Outputs.GoogleCloudAiplatformV1DeployedIndexResponse> - The indexes deployed in this endpoint.
- Description string - The description of the IndexEndpoint.
- DisplayName string - The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- EnablePrivateServiceConnect bool - Optional. Deprecated: If true, expose the IndexEndpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.
- EncryptionSpec Pulumi.GoogleNative.Aiplatform.V1.Outputs.GoogleCloudAiplatformV1EncryptionSpecResponse - Immutable. Customer-managed encryption key spec for an IndexEndpoint. If set, this IndexEndpoint and all sub-resources of this IndexEndpoint will be secured by this key.
- Etag string - Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
- Labels Dictionary<string, string> - The labels with user-defined metadata to organize your IndexEndpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- Name string - The resource name of the IndexEndpoint.
- Network string - Optional. The full name of the Google Compute Engine network to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. network and private_service_connect_config are mutually exclusive. Format: projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is the network name.
- PrivateServiceConnectConfig Pulumi.GoogleNative.Aiplatform.V1.Outputs.GoogleCloudAiplatformV1PrivateServiceConnectConfigResponse - Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.
- PublicEndpointDomainName string - If public_endpoint_enabled is true, this field will be populated with the domain name to use for this index endpoint.
- PublicEndpointEnabled bool - Optional. If true, the deployed index will be accessible through a public endpoint.
- UpdateTime string - Timestamp when this IndexEndpoint was last updated. This timestamp is not updated when the endpoint's DeployedIndexes are updated, e.g. due to updates of the original Indexes they are the deployments of.
- CreateTime string - Timestamp when this IndexEndpoint was created.
- DeployedIndexes []GoogleCloudAiplatformV1DeployedIndexResponse - The indexes deployed in this endpoint.
- Description string - The description of the IndexEndpoint.
- DisplayName string - The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- EnablePrivateServiceConnect bool - Optional. Deprecated: If true, expose the IndexEndpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.
- EncryptionSpec GoogleCloudAiplatformV1EncryptionSpecResponse - Immutable. Customer-managed encryption key spec for an IndexEndpoint. If set, this IndexEndpoint and all sub-resources of this IndexEndpoint will be secured by this key.
- Etag string - Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
- Labels map[string]string - The labels with user-defined metadata to organize your IndexEndpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- Name string - The resource name of the IndexEndpoint.
- Network string - Optional. The full name of the Google Compute Engine network to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. network and private_service_connect_config are mutually exclusive. Format: projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is the network name.
- PrivateServiceConnectConfig GoogleCloudAiplatformV1PrivateServiceConnectConfigResponse - Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.
- PublicEndpointDomainName string - If public_endpoint_enabled is true, this field will be populated with the domain name to use for this index endpoint.
- PublicEndpointEnabled bool - Optional. If true, the deployed index will be accessible through a public endpoint.
- UpdateTime string - Timestamp when this IndexEndpoint was last updated. This timestamp is not updated when the endpoint's DeployedIndexes are updated, e.g. due to updates of the original Indexes they are the deployments of.
- createTime String - Timestamp when this IndexEndpoint was created.
- deployedIndexes List<GoogleCloudAiplatformV1DeployedIndexResponse> - The indexes deployed in this endpoint.
- description String - The description of the IndexEndpoint.
- displayName String - The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- enablePrivateServiceConnect Boolean - Optional. Deprecated: If true, expose the IndexEndpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.
- encryptionSpec GoogleCloudAiplatformV1EncryptionSpecResponse - Immutable. Customer-managed encryption key spec for an IndexEndpoint. If set, this IndexEndpoint and all sub-resources of this IndexEndpoint will be secured by this key.
- etag String - Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
- labels Map<String,String> - The labels with user-defined metadata to organize your IndexEndpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- name String - The resource name of the IndexEndpoint.
- network String - Optional. The full name of the Google Compute Engine network to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. network and private_service_connect_config are mutually exclusive. Format: projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is the network name.
- privateServiceConnectConfig GoogleCloudAiplatformV1PrivateServiceConnectConfigResponse - Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.
- publicEndpointDomainName String - If public_endpoint_enabled is true, this field will be populated with the domain name to use for this index endpoint.
- publicEndpointEnabled Boolean - Optional. If true, the deployed index will be accessible through a public endpoint.
- updateTime String - Timestamp when this IndexEndpoint was last updated. This timestamp is not updated when the endpoint's DeployedIndexes are updated, e.g. due to updates of the original Indexes they are the deployments of.
- createTime string - Timestamp when this IndexEndpoint was created.
- deployedIndexes GoogleCloudAiplatformV1DeployedIndexResponse[] - The indexes deployed in this endpoint.
- description string - The description of the IndexEndpoint.
- displayName string - The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- enablePrivateServiceConnect boolean - Optional. Deprecated: If true, expose the IndexEndpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.
- encryptionSpec GoogleCloudAiplatformV1EncryptionSpecResponse - Immutable. Customer-managed encryption key spec for an IndexEndpoint. If set, this IndexEndpoint and all sub-resources of this IndexEndpoint will be secured by this key.
- etag string - Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
- labels {[key: string]: string} - The labels with user-defined metadata to organize your IndexEndpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- name string - The resource name of the IndexEndpoint.
- network string - Optional. The full name of the Google Compute Engine network to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. network and private_service_connect_config are mutually exclusive. Format: projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is the network name.
- privateServiceConnectConfig GoogleCloudAiplatformV1PrivateServiceConnectConfigResponse - Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.
- publicEndpointDomainName string - If public_endpoint_enabled is true, this field will be populated with the domain name to use for this index endpoint.
- publicEndpointEnabled boolean - Optional. If true, the deployed index will be accessible through a public endpoint.
- updateTime string - Timestamp when this IndexEndpoint was last updated. This timestamp is not updated when the endpoint's DeployedIndexes are updated, e.g. due to updates of the original Indexes they are the deployments of.
- create_time str - Timestamp when this IndexEndpoint was created.
- deployed_indexes Sequence[GoogleCloudAiplatformV1DeployedIndexResponse] - The indexes deployed in this endpoint.
- description str - The description of the IndexEndpoint.
- display_name str - The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- enable_private_service_connect bool - Optional. Deprecated: If true, expose the IndexEndpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.
- encryption_spec GoogleCloudAiplatformV1EncryptionSpecResponse - Immutable. Customer-managed encryption key spec for an IndexEndpoint. If set, this IndexEndpoint and all sub-resources of this IndexEndpoint will be secured by this key.
- etag str - Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
- labels Mapping[str, str] - The labels with user-defined metadata to organize your IndexEndpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- name str - The resource name of the IndexEndpoint.
- network str - Optional. The full name of the Google Compute Engine network to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. network and private_service_connect_config are mutually exclusive. Format: projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is the network name.
- private_service_connect_config GoogleCloudAiplatformV1PrivateServiceConnectConfigResponse - Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.
- public_endpoint_domain_name str - If public_endpoint_enabled is true, this field will be populated with the domain name to use for this index endpoint.
- public_endpoint_enabled bool - Optional. If true, the deployed index will be accessible through a public endpoint.
- update_time str - Timestamp when this IndexEndpoint was last updated. This timestamp is not updated when the endpoint's DeployedIndexes are updated, e.g. due to updates of the original Indexes they are the deployments of.
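The constraints stated for labels (at most 64 Unicode code points, only lowercase letters, numeric characters, underscores and dashes, international characters allowed) can be checked locally. The following is a minimal sketch, not part of the SDK: the helper names are hypothetical, and "international characters are allowed" is interpreted here as "any Unicode letter that is not uppercase", which is an assumption.

```python
from typing import Dict, List

MAX_LABEL_LENGTH = 64  # limit in Unicode code points, per the labels description


def _is_allowed_char(ch: str) -> bool:
    """Allow '_' and '-', digits, and letters that are not uppercase."""
    if ch in "_-":
        return True
    if ch.isdigit():
        return True
    # Assumption: international letters count as allowed unless uppercase.
    return ch.isalpha() and not ch.isupper()


def is_valid_label_part(text: str) -> bool:
    """Check one label key or value against the documented constraints."""
    return len(text) <= MAX_LABEL_LENGTH and all(_is_allowed_char(c) for c in text)


def invalid_labels(labels: Dict[str, str]) -> List[str]:
    """Return the keys whose key or value breaks the constraints."""
    return [
        key
        for key, value in labels.items()
        if not (is_valid_label_part(key) and is_valid_label_part(value))
    ]
```

For example, invalid_labels({"env": "prod", "Owner": "alice"}) flags only "Owner", because of the uppercase letter in the key.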
- createTime String - Timestamp when this IndexEndpoint was created.
- deployedIndexes List<Property Map> - The indexes deployed in this endpoint.
- description String - The description of the IndexEndpoint.
- displayName String - The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
- enablePrivateServiceConnect Boolean - Optional. Deprecated: If true, expose the IndexEndpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.
- encryptionSpec Property Map - Immutable. Customer-managed encryption key spec for an IndexEndpoint. If set, this IndexEndpoint and all sub-resources of this IndexEndpoint will be secured by this key.
- etag String - Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.
- labels Map<String> - The labels with user-defined metadata to organize your IndexEndpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
- name String - The resource name of the IndexEndpoint.
- network String - Optional. The full name of the Google Compute Engine network to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. network and private_service_connect_config are mutually exclusive. Format: projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is the network name.
- privateServiceConnectConfig Property Map - Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.
- publicEndpointDomainName String - If public_endpoint_enabled is true, this field will be populated with the domain name to use for this index endpoint.
- publicEndpointEnabled Boolean - Optional. If true, the deployed index will be accessible through a public endpoint.
- updateTime String - Timestamp when this IndexEndpoint was last updated. This timestamp is not updated when the endpoint's DeployedIndexes are updated, e.g. due to updates of the original Indexes they are the deployments of.
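The network property uses the format projects/{project}/global/networks/{network}, with a project number rather than a project ID. As an illustration, the sketch below builds and parses such a name with the Python standard library. The function names are hypothetical, and the rule that the network segment is any non-empty string without slashes is an assumption, not something the description states.

```python
import re
from typing import Optional, Tuple

# Pattern for projects/{project}/global/networks/{network}.
# The project part must be numeric, as in '12345'. The network part is only
# required to be non-empty and slash-free (an assumption made for this sketch).
_NETWORK_RE = re.compile(r"projects/(\d+)/global/networks/([^/]+)")


def network_name(project_number: int, network: str) -> str:
    """Build the full network name from a project number and network name."""
    return f"projects/{project_number}/global/networks/{network}"


def parse_network_name(name: str) -> Optional[Tuple[int, str]]:
    """Return (project_number, network), or None if the format does not match."""
    match = _NETWORK_RE.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

A name built from a project ID, such as projects/my-project/global/networks/my-vpc, is rejected by this parser because the project segment is not a number.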
Supporting Types
GoogleCloudAiplatformV1AutomaticResourcesResponse
- MaxReplicaCount int - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic will be assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- MinReplicaCount int - Immutable. The minimum number of replicas this DeployedModel will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
- MaxReplicaCount int - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic will be assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- MinReplicaCount int - Immutable. The minimum number of replicas this DeployedModel will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
- maxReplicaCount Integer - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic will be assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- minReplicaCount Integer - Immutable. The minimum number of replicas this DeployedModel will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
- maxReplicaCount number - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic will be assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- minReplicaCount number - Immutable. The minimum number of replicas this DeployedModel will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
- max_replica_count int - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic will be assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- min_replica_count int - Immutable. The minimum number of replicas this DeployedModel will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
- maxReplicaCount Number - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic will be assumed, though Vertex AI may be unable to scale beyond a certain replica number.
- minReplicaCount Number - Immutable. The minimum number of replicas this DeployedModel will always be deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
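The interaction between min_replica_count and max_replica_count described above can be pictured as a clamp. The sketch below is only a mental model, not Vertex AI's actual scheduler: the function name and the `needed` parameter (a hypothetical count of replicas the current traffic would require) are assumptions introduced for illustration.

```python
from typing import Optional, Tuple


def replicas_in_use(
    needed: int,
    min_replica_count: int,
    max_replica_count: Optional[int] = None,
) -> Tuple[int, bool]:
    """Return (replicas, traffic_dropped) for a hypothetical demand `needed`.

    With max_replica_count unset, no upper bound is assumed, matching the
    AutomaticResources description (the service may still hit its own limit).
    """
    if max_replica_count is None:
        return max(needed, min_replica_count), False
    if min_replica_count > max_replica_count:
        raise ValueError("min_replica_count must not exceed max_replica_count")
    # Never below the minimum, never above the maximum.
    replicas = min(max(needed, min_replica_count), max_replica_count)
    # Demand beyond the maximum means a portion of the traffic is dropped.
    return replicas, needed > max_replica_count
```

With min_replica_count=2 and max_replica_count=5, a demand of 9 replicas yields (5, True): the deployment stays at the maximum and some traffic is dropped.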
GoogleCloudAiplatformV1AutoscalingMetricSpecResponse
- MetricName string - The resource metric name. Supported metrics for Online Prediction: aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle and aiplatform.googleapis.com/prediction/online/cpu/utilization.
- Target int - The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
- MetricName string - The resource metric name. Supported metrics for Online Prediction: aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle and aiplatform.googleapis.com/prediction/online/cpu/utilization.
- Target int - The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
- metricName String - The resource metric name. Supported metrics for Online Prediction: aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle and aiplatform.googleapis.com/prediction/online/cpu/utilization.
- target Integer - The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
- metricName string - The resource metric name. Supported metrics for Online Prediction: aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle and aiplatform.googleapis.com/prediction/online/cpu/utilization.
- target number - The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
- metric_name str - The resource metric name. Supported metrics for Online Prediction: aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle and aiplatform.googleapis.com/prediction/online/cpu/utilization.
- target int - The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
- metricName String - The resource metric name. Supported metrics for Online Prediction: aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle and aiplatform.googleapis.com/prediction/online/cpu/utilization.
- target Number - The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
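The two fields of an autoscaling metric spec carry simple rules: metric_name is one of the two supported Online Prediction metrics, and target is a percentage from 1 to 100 that defaults to 60. A minimal sketch of those rules as a local check follows. The MetricSpec class is hypothetical and is not the SDK's GoogleCloudAiplatformV1AutoscalingMetricSpecResponse type.

```python
from dataclasses import dataclass

# The two metric names listed as supported for Online Prediction.
SUPPORTED_METRICS = (
    "aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle",
    "aiplatform.googleapis.com/prediction/online/cpu/utilization",
)
DEFAULT_TARGET = 60  # used when no target is provided


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical local mirror of one autoscaling metric spec."""

    metric_name: str
    target: int = DEFAULT_TARGET

    def __post_init__(self) -> None:
        if self.metric_name not in SUPPORTED_METRICS:
            raise ValueError(f"unsupported metric: {self.metric_name!r}")
        if not 1 <= self.target <= 100:
            raise ValueError("target must be a percentage between 1 and 100")
```

Constructing MetricSpec(SUPPORTED_METRICS[1]) leaves target at 60, while a target of 0 or 101 raises ValueError.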
GoogleCloudAiplatformV1DedicatedResourcesResponse
- AutoscalingMetricSpecs List<Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1AutoscalingMetricSpecResponse> - Immutable. The metric specifications that override a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on) target value (default to 60 if not set). At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, the autoscaling will be based on both CPU utilization and accelerator's duty cycle metrics, scaling up when either metric exceeds its target value and scaling down if both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, the autoscaling will be based on CPU utilization metric only with default target value 60 if not explicitly set. For example, in the case of Online Prediction, if you want to override target CPU utilization to 80, you should set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
- MachineSpec Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1MachineSpecResponse - Immutable. The specification of a single machine used by the prediction.
- MaxReplicaCount int - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- MinReplicaCount int - Immutable. The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
- AutoscalingMetricSpecs []GoogleCloudAiplatformV1AutoscalingMetricSpecResponse - Immutable. The metric specifications that override a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on) target value (default to 60 if not set). At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, the autoscaling will be based on both CPU utilization and accelerator's duty cycle metrics, scaling up when either metric exceeds its target value and scaling down if both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, the autoscaling will be based on CPU utilization metric only with default target value 60 if not explicitly set. For example, in the case of Online Prediction, if you want to override target CPU utilization to 80, you should set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
- MachineSpec GoogleCloudAiplatformV1MachineSpecResponse - Immutable. The specification of a single machine used by the prediction.
- MaxReplicaCount int - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- MinReplicaCount int - Immutable. The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
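The autoscaling rule described for dedicated resources (scale up when either metric exceeds its target, scale down only when all tracked metrics are under their targets) can be sketched as a small decision function. This is an illustration under stated assumptions, not the service's algorithm: the function and parameter names are hypothetical, and the tolerance band implied by "deviates from the target by a certain percentage" is left out because its size is not given.

```python
from typing import Optional


def scaling_direction(
    cpu_utilization: float,
    accelerator_duty_cycle: Optional[float] = None,
    cpu_target: int = 60,
    accelerator_target: int = 60,
) -> str:
    """Return 'up', 'down', or 'hold' for the given utilization readings.

    Pass accelerator_duty_cycle=None to model machine_spec.accelerator_count
    being 0, in which case only CPU utilization is considered.
    """
    readings = [(cpu_utilization, cpu_target)]
    if accelerator_duty_cycle is not None:
        readings.append((accelerator_duty_cycle, accelerator_target))
    # Any metric above its target is enough to scale up.
    if any(value > target for value, target in readings):
        return "up"
    # Scaling down requires every tracked metric to be under its target.
    if all(value < target for value, target in readings):
        return "down"
    return "hold"
```

For instance, CPU at 85% with the accelerator at 40% gives "up", while 40% and 50% give "down" under the default targets of 60.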
- autoscalingMetricSpecs List<GoogleCloudAiplatformV1AutoscalingMetricSpecResponse> - Immutable. The metric specifications that override a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on) target value (default to 60 if not set). At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, the autoscaling will be based on both CPU utilization and accelerator's duty cycle metrics, scaling up when either metric exceeds its target value and scaling down if both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, the autoscaling will be based on CPU utilization metric only with default target value 60 if not explicitly set. For example, in the case of Online Prediction, if you want to override target CPU utilization to 80, you should set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
- machineSpec GoogleCloudAiplatformV1MachineSpecResponse - Immutable. The specification of a single machine used by the prediction.
- maxReplicaCount Integer - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- minReplicaCount Integer - Immutable. The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
- autoscalingMetricSpecs GoogleCloudAiplatformV1AutoscalingMetricSpecResponse[] - Immutable. The metric specifications that override the target value of a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on); the target defaults to 60 if not set. At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, autoscaling is based on both the CPU utilization and accelerator's duty cycle metrics: it scales up when either metric exceeds its target value and scales down when both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, autoscaling is based on the CPU utilization metric only, with a default target value of 60 if not explicitly set. For example, in the case of Online Prediction, to override the target CPU utilization to 80, set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
- machineSpec GoogleCloudAiplatformV1MachineSpecResponse - Immutable. The specification of a single machine used by the prediction.
- maxReplicaCount number - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- minReplicaCount number - Immutable. The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
- autoscaling_metric_specs Sequence[GoogleCloudAiplatformV1AutoscalingMetricSpecResponse] - Immutable. The metric specifications that override the target value of a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on); the target defaults to 60 if not set. At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, autoscaling is based on both the CPU utilization and accelerator's duty cycle metrics: it scales up when either metric exceeds its target value and scales down when both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, autoscaling is based on the CPU utilization metric only, with a default target value of 60 if not explicitly set. For example, in the case of Online Prediction, to override the target CPU utilization to 80, set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
- machine_spec GoogleCloudAiplatformV1MachineSpecResponse - Immutable. The specification of a single machine used by the prediction.
- max_replica_count int - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- min_replica_count int - Immutable. The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
- autoscalingMetricSpecs List<Property Map> - Immutable. The metric specifications that override the target value of a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on); the target defaults to 60 if not set. At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, autoscaling is based on both the CPU utilization and accelerator's duty cycle metrics: it scales up when either metric exceeds its target value and scales down when both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, autoscaling is based on the CPU utilization metric only, with a default target value of 60 if not explicitly set. For example, in the case of Online Prediction, to override the target CPU utilization to 80, set autoscaling_metric_specs.metric_name to aiplatform.googleapis.com/prediction/online/cpu/utilization and autoscaling_metric_specs.target to 80.
- machineSpec Property Map - Immutable. The specification of a single machine used by the prediction.
- maxReplicaCount Number - Immutable. The maximum number of replicas this DeployedModel may be deployed on when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the DeployedModel increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count is used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
- minReplicaCount Number - Immutable. The minimum number of machine replicas this DeployedModel will always be deployed on. This value must be greater than or equal to 1. If traffic against the DeployedModel increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
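The scaling rule and quota arithmetic described for the autoscaling metric specs and the maximum replica count can be modelled in a few lines. The following is only an illustrative sketch, not Vertex AI's implementation: the function names are hypothetical, ACCEL_METRIC is a placeholder key rather than a real metric name, and the "hold" outcome for values exactly at the target is an assumption.

```python
DEFAULT_TARGET = 60  # default target value stated in the description

CPU_METRIC = "aiplatform.googleapis.com/prediction/online/cpu/utilization"
ACCEL_METRIC = "accelerator_duty_cycle"  # placeholder key, not a real metric name


def scaling_decision(observed, targets, accelerator_count):
    """Return "up", "down", or "hold" for observed metric values (percent)."""
    # The accelerator metric only participates when accelerators are attached.
    metrics = [CPU_METRIC] + ([ACCEL_METRIC] if accelerator_count > 0 else [])
    over = [observed[m] > targets.get(m, DEFAULT_TARGET) for m in metrics]
    under = [observed[m] < targets.get(m, DEFAULT_TARGET) for m in metrics]
    if any(over):   # scale up when either metric exceeds its target
        return "up"
    if all(under):  # scale down only when every metric is under its target
        return "down"
    return "hold"


def quota_charge(max_replica_count, cores_per_machine, gpus_per_replica):
    """Return the (CPU, GPU) quota charged for a deployment."""
    return (max_replica_count * cores_per_machine,
            max_replica_count * gpus_per_replica)
```

For instance, overriding the CPU target to 80 means an observed CPU utilization of 70 no longer triggers a scale-up by itself, and a maximum of 4 replicas on a 16-core machine with 1 GPU per replica is charged as 64 cores and 4 GPUs.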
GoogleCloudAiplatformV1DeployedIndexAuthConfigAuthProviderResponse
- AllowedIssuers List<string> - A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com
- Audiences List<string> - The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
- AllowedIssuers []string - A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com
- Audiences []string - The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
- allowedIssuers List<String> - A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com
- audiences List<String> - The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
- allowedIssuers string[] - A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com
- audiences string[] - The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
- allowed_issuers Sequence[str] - A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com
- audiences Sequence[str] - The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
- allowedIssuers List<String> - A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: service-account-name@project-id.iam.gserviceaccount.com
- audiences List<String> - The list of JWT audiences that are allowed access. A JWT containing any of these audiences will be accepted.
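The issuer format and the "any of these audiences" rule above can be checked locally before a configuration is submitted. This is a minimal sketch using hypothetical helper names; the regular expression only tests the general shape service-account-name@project-id.iam.gserviceaccount.com and is an assumption, since the real naming rules for service accounts and project IDs are stricter.

```python
import re

# Loose shape check only: name@project-id.iam.gserviceaccount.com
_ISSUER_RE = re.compile(r"[a-z0-9-]+@[a-z0-9-]+\.iam\.gserviceaccount\.com")


def is_plausible_issuer(issuer: str) -> bool:
    """True if the issuer looks like a Google service account email."""
    return _ISSUER_RE.fullmatch(issuer) is not None


def audience_accepted(token_audiences, allowed_audiences) -> bool:
    """True if the JWT carries at least one of the allowed audiences."""
    return any(aud in allowed_audiences for aud in token_audiences)
```

A check like this catches typos early, but it does not replace the service's own validation.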
GoogleCloudAiplatformV1DeployedIndexAuthConfigResponse
- AuthProvider Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1DeployedIndexAuthConfigAuthProviderResponse - Defines the authentication provider that the DeployedIndex uses.
- AuthProvider GoogleCloudAiplatformV1DeployedIndexAuthConfigAuthProviderResponse - Defines the authentication provider that the DeployedIndex uses.
- authProvider GoogleCloudAiplatformV1DeployedIndexAuthConfigAuthProviderResponse - Defines the authentication provider that the DeployedIndex uses.
- authProvider GoogleCloudAiplatformV1DeployedIndexAuthConfigAuthProviderResponse - Defines the authentication provider that the DeployedIndex uses.
- auth_provider GoogleCloudAiplatformV1DeployedIndexAuthConfigAuthProviderResponse - Defines the authentication provider that the DeployedIndex uses.
- authProvider Property Map - Defines the authentication provider that the DeployedIndex uses.
GoogleCloudAiplatformV1DeployedIndexResponse
- AutomaticResources Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1AutomaticResourcesResponse - Optional. A description of resources that the DeployedIndex uses, which to a large degree are decided by Vertex AI, and optionally allows only a modest additional configuration. If min_replica_count is not set, the default value is 2 (we don't provide SLA when min_replica_count=1). If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000.
- CreateTime string - Timestamp when the DeployedIndex was created.
- DedicatedResources Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1DedicatedResourcesResponse - Optional. A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field min_replica_count must be set to a value strictly greater than 0, or else validation will fail. We don't provide SLA when min_replica_count=1. If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
- DeployedIndexAuthConfig Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1DeployedIndexAuthConfigResponse - Optional. If set, authentication is enabled for the private endpoint.
- DeploymentGroup string - Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
- DisplayName string - The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
- EnableAccessLogging bool - Optional. If true, private endpoint's access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
- Index string - The name of the Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
- IndexSyncTime string - The DeployedIndex may depend on various data on its original Index. Additionally, when certain changes to the original Index are being made (e.g. when what the Index contains is being changed), the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp's value is at least the Index.update_time of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must list the operations that are running on the original Index. Only the successfully completed Operations with update_time equal to or before this sync time are contained in this DeployedIndex.
- PrivateEndpoints Pulumi.GoogleNative.Aiplatform.V1.Inputs.GoogleCloudAiplatformV1IndexPrivateEndpointsResponse - Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if network is configured.
- ReservedIpRanges List<string> - Optional. A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: ['vertex-ai-ip-range']. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
- AutomaticResources GoogleCloudAiplatformV1AutomaticResourcesResponse - Optional. A description of resources that the DeployedIndex uses, which to a large degree are decided by Vertex AI, and optionally allows only a modest additional configuration. If min_replica_count is not set, the default value is 2 (we don't provide SLA when min_replica_count=1). If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000.
- CreateTime string - Timestamp when the DeployedIndex was created.
- DedicatedResources GoogleCloudAiplatformV1DedicatedResourcesResponse - Optional. A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field min_replica_count must be set to a value strictly greater than 0, or else validation will fail. We don't provide SLA when min_replica_count=1. If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
- DeployedIndexAuthConfig GoogleCloudAiplatformV1DeployedIndexAuthConfigResponse - Optional. If set, authentication is enabled for the private endpoint.
- DeploymentGroup string - Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
- DisplayName string - The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
- EnableAccessLogging bool - Optional. If true, private endpoint's access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
- Index string - The name of the Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
- IndexSyncTime string - The DeployedIndex may depend on various data on its original Index. Additionally, when certain changes to the original Index are being made (e.g. when what the Index contains is being changed), the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp's value is at least the Index.update_time of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must list the operations that are running on the original Index. Only the successfully completed Operations with update_time equal to or before this sync time are contained in this DeployedIndex.
- PrivateEndpoints GoogleCloudAiplatformV1IndexPrivateEndpointsResponse - Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if network is configured.
- ReservedIpRanges []string - Optional. A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: ['vertex-ai-ip-range']. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
- automaticResources GoogleCloudAiplatformV1AutomaticResourcesResponse - Optional. A description of resources that the DeployedIndex uses, which to a large degree are decided by Vertex AI, and optionally allows only a modest additional configuration. If min_replica_count is not set, the default value is 2 (we don't provide SLA when min_replica_count=1). If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000.
- createTime String - Timestamp when the DeployedIndex was created.
- dedicatedResources GoogleCloudAiplatformV1DedicatedResourcesResponse - Optional. A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field min_replica_count must be set to a value strictly greater than 0, or else validation will fail. We don't provide SLA when min_replica_count=1. If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
- deployedIndexAuthConfig GoogleCloudAiplatformV1DeployedIndexAuthConfigResponse - Optional. If set, authentication is enabled for the private endpoint.
- deploymentGroup String - Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
- displayName String - The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
- enableAccessLogging Boolean - Optional. If true, private endpoint's access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
- index String - The name of the Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
- indexSyncTime String - The DeployedIndex may depend on various data on its original Index. Additionally, when certain changes to the original Index are being made (e.g. when what the Index contains is being changed), the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp's value is at least the Index.update_time of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must list the operations that are running on the original Index. Only the successfully completed Operations with update_time equal to or before this sync time are contained in this DeployedIndex.
- privateEndpoints GoogleCloudAiplatformV1IndexPrivateEndpointsResponse - Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if network is configured.
- reservedIpRanges List<String> - Optional. A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: ['vertex-ai-ip-range']. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
- automaticResources GoogleCloudAiplatformV1AutomaticResourcesResponse - Optional. A description of resources that the DeployedIndex uses, which to a large degree are decided by Vertex AI, and optionally allows only a modest additional configuration. If min_replica_count is not set, the default value is 2 (we don't provide SLA when min_replica_count=1). If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000.
- createTime string - Timestamp when the DeployedIndex was created.
- dedicatedResources GoogleCloudAiplatformV1DedicatedResourcesResponse - Optional. A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field min_replica_count must be set to a value strictly greater than 0, or else validation will fail. We don't provide SLA when min_replica_count=1. If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
- deployedIndexAuthConfig GoogleCloudAiplatformV1DeployedIndexAuthConfigResponse - Optional. If set, authentication is enabled for the private endpoint.
- deploymentGroup string - Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
- displayName string - The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
- enableAccessLogging boolean - Optional. If true, private endpoint's access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
- index string - The name of the Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
- indexSyncTime string - The DeployedIndex may depend on various data on its original Index. Additionally, when certain changes to the original Index are being made (e.g. when what the Index contains is being changed), the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp's value is at least the Index.update_time of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must list the operations that are running on the original Index. Only the successfully completed Operations with update_time equal to or before this sync time are contained in this DeployedIndex.
- privateEndpoints GoogleCloudAiplatformV1IndexPrivateEndpointsResponse - Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if network is configured.
- reservedIpRanges string[] - Optional. A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: ['vertex-ai-ip-range']. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
- automatic_
resources GoogleCloud Aiplatform V1Automatic Resources Response - Optional. A description of resources that the DeployedIndex uses, which to large degree are decided by Vertex AI, and optionally allows only a modest additional configuration. If min_replica_count is not set, the default value is 2 (we don't provide SLA when min_replica_count=1). If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000.
- create_time str - Timestamp when the DeployedIndex was created.
- dedicated_resources GoogleCloudAiplatformV1DedicatedResourcesResponse - Optional. A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field min_replica_count must be set to a value strictly greater than 0, or else validation will fail. We don't provide SLA when min_replica_count=1. If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
- deployed_index_auth_config GoogleCloudAiplatformV1DeployedIndexAuthConfigResponse - Optional. If set, the authentication is enabled for the private endpoint.
- deployment_group str - Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
- display_name str - The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
- enable_access_logging bool - Optional. If true, private endpoint's access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
- index str - The name of the Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
- index_sync_time str - The DeployedIndex may depend on various data on its original Index. Additionally when certain changes to the original Index are being done (e.g. when what the Index contains is being changed) the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp's value is at least the Index.update_time of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must list the operations that are running on the original Index. Only the successfully completed Operations with update_time equal or before this sync time are contained in this DeployedIndex.
- private_endpoints GoogleCloudAiplatformV1IndexPrivateEndpointsResponse - Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if network is configured.
- reserved_ip_ranges Sequence[str] - Optional. A list of reserved ip ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided ip ranges. Otherwise, the index might be deployed to any ip ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses) Example: ['vertex-ai-ip-range']. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
- automaticResources Property Map - Optional. A description of resources that the DeployedIndex uses, which to large degree are decided by Vertex AI, and optionally allows only a modest additional configuration. If min_replica_count is not set, the default value is 2 (we don't provide SLA when min_replica_count=1). If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000.
- createTime String - Timestamp when the DeployedIndex was created.
- dedicatedResources Property Map - Optional. A description of resources that are dedicated to the DeployedIndex, and that need a higher degree of manual configuration. The field min_replica_count must be set to a value strictly greater than 0, or else validation will fail. We don't provide SLA when min_replica_count=1. If max_replica_count is not set, the default value is min_replica_count. The max allowed replica count is 1000. Available machine types for SMALL shard: e2-standard-2 and all machine types available for MEDIUM and LARGE shard. Available machine types for MEDIUM shard: e2-standard-16 and all machine types available for LARGE shard. Available machine types for LARGE shard: e2-highmem-16, n2d-standard-32. n1-standard-16 and n1-standard-32 are still available, but we recommend e2-standard-16 and e2-highmem-16 for cost efficiency.
- deployedIndexAuthConfig Property Map - Optional. If set, the authentication is enabled for the private endpoint.
- deploymentGroup String - Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating deployment_groups with reserved_ip_ranges is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
- displayName String - The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
- enableAccessLogging Boolean - Optional. If true, private endpoint's access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
- index String - The name of the Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
- indexSyncTime String - The DeployedIndex may depend on various data on its original Index. Additionally when certain changes to the original Index are being done (e.g. when what the Index contains is being changed) the DeployedIndex may be asynchronously updated in the background to reflect these changes. If this timestamp's value is at least the Index.update_time of the original Index, it means that this DeployedIndex and the original Index are in sync. If this timestamp is older, then to see which updates this DeployedIndex already contains (and which it does not), one must list the operations that are running on the original Index. Only the successfully completed Operations with update_time equal or before this sync time are contained in this DeployedIndex.
- privateEndpoints Property Map - Provides paths for users to send requests directly to the deployed index services running on Cloud via private services access. This field is populated if network is configured.
- reservedIpRanges List<String> - Optional. A list of reserved ip ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided ip ranges. Otherwise, the index might be deployed to any ip ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses) Example: ['vertex-ai-ip-range']. For more information about subnets and network IP ranges, please see https://cloud.google.com/vpc/docs/subnets#manually_created_subnet_ip_ranges.
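The replica-count rules stated for automatic and dedicated resources above can be sketched as a small helper. This is only an illustration of the documented defaults, not how Vertex AI or the Pulumi provider implements them; the names resolve_replicas, ReplicaRange and sla_excluded are hypothetical and do not exist in any SDK.

```python
from dataclasses import dataclass
from typing import Optional

# "The max allowed replica count is 1000."
MAX_REPLICA_COUNT = 1000


@dataclass(frozen=True)
class ReplicaRange:
    """Hypothetical container for the resolved replica settings."""
    min_replica_count: int
    max_replica_count: int
    sla_excluded: bool  # True when min_replica_count == 1 (no SLA provided)


def resolve_replicas(min_count: Optional[int],
                     max_count: Optional[int],
                     dedicated: bool) -> ReplicaRange:
    """Apply the defaults described in the field documentation."""
    if dedicated:
        # Dedicated resources: min_replica_count must be strictly greater than 0.
        if min_count is None or min_count <= 0:
            raise ValueError("dedicated resources need min_replica_count > 0")
    elif min_count is None:
        # Automatic resources: an unset min_replica_count defaults to 2.
        min_count = 2
    if max_count is None:
        # An unset max_replica_count defaults to min_replica_count.
        max_count = min_count
    if max_count > MAX_REPLICA_COUNT:
        raise ValueError("max_replica_count may not exceed 1000")
    return ReplicaRange(min_count, max_count, sla_excluded=(min_count == 1))
```

For example, resolve_replicas(None, None, dedicated=False) yields a range of 2 to 2 replicas, while the same call with dedicated=True is rejected because dedicated resources have no default minimum.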
GoogleCloudAiplatformV1EncryptionSpecResponse
- KmsKeyName string - The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- KmsKeyName string - The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- kmsKeyName String - The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- kmsKeyName string - The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- kms_key_name str - The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
- kmsKeyName String - The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created.
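Because the key name follows a fixed path layout, it can be split into its parts before use. The sketch below assumes the form shown above (projects/.../locations/.../keyRings/.../cryptoKeys/...) and nothing more; parse_kms_key_name and key_matches_region are hypothetical helper names, and the region check is a plain string comparison rather than anything the service itself performs.

```python
import re
from typing import Dict

# Pattern for the documented form:
# projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key
_KMS_KEY_RE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)"
    r"/cryptoKeys/(?P<crypto_key>[^/]+)"
)


def parse_kms_key_name(name: str) -> Dict[str, str]:
    """Split a KMS key name into its four path components."""
    match = _KMS_KEY_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a KMS key name of the documented form: {name!r}")
    return match.groupdict()


def key_matches_region(name: str, region: str) -> bool:
    """Return True when the key's location segment equals the given region."""
    return parse_kms_key_name(name)["location"] == region
```

A check like key_matches_region(kms_key_name, location) before deployment may catch a key that lives in a different region from the resource it is meant to protect.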
GoogleCloudAiplatformV1IndexPrivateEndpointsResponse
- MatchGrpcAddress string - The ip address used to send match gRPC requests.
- ServiceAttachment string - The name of the service attachment resource. Populated if private service connect is enabled.
- MatchGrpcAddress string - The ip address used to send match gRPC requests.
- ServiceAttachment string - The name of the service attachment resource. Populated if private service connect is enabled.
- matchGrpcAddress String - The ip address used to send match gRPC requests.
- serviceAttachment String - The name of the service attachment resource. Populated if private service connect is enabled.
- matchGrpcAddress string - The ip address used to send match gRPC requests.
- serviceAttachment string - The name of the service attachment resource. Populated if private service connect is enabled.
- match_grpc_address str - The ip address used to send match gRPC requests.
- service_attachment str - The name of the service attachment resource. Populated if private service connect is enabled.
- matchGrpcAddress String - The ip address used to send match gRPC requests.
- serviceAttachment String - The name of the service attachment resource. Populated if private service connect is enabled.
GoogleCloudAiplatformV1MachineSpecResponse
- AcceleratorCount int - The number of accelerators to attach to the machine.
- AcceleratorType string - Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
- MachineType string - Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
- TpuTopology string - Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
- AcceleratorCount int - The number of accelerators to attach to the machine.
- AcceleratorType string - Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
- MachineType string - Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
- TpuTopology string - Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
- acceleratorCount Integer - The number of accelerators to attach to the machine.
- acceleratorType String - Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
- machineType String - Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
- tpuTopology String - Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
- acceleratorCount number - The number of accelerators to attach to the machine.
- acceleratorType string - Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
- machineType string - Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
- tpuTopology string - Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
- accelerator_count int - The number of accelerators to attach to the machine.
- accelerator_type str - Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
- machine_type str - Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
- tpu_topology str - Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
- acceleratorCount Number - The number of accelerators to attach to the machine.
- acceleratorType String - Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
- machineType String - Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
- tpuTopology String - Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
GoogleCloudAiplatformV1PrivateServiceConnectConfigResponse
- EnablePrivateServiceConnect bool - If true, expose the IndexEndpoint via private service connect.
- ProjectAllowlist List<string> - A list of Projects from which the forwarding rule will target the service attachment.
- EnablePrivateServiceConnect bool - If true, expose the IndexEndpoint via private service connect.
- ProjectAllowlist []string - A list of Projects from which the forwarding rule will target the service attachment.
- enablePrivateServiceConnect Boolean - If true, expose the IndexEndpoint via private service connect.
- projectAllowlist List<String> - A list of Projects from which the forwarding rule will target the service attachment.
- enablePrivateServiceConnect boolean - If true, expose the IndexEndpoint via private service connect.
- projectAllowlist string[] - A list of Projects from which the forwarding rule will target the service attachment.
- enable_private_service_connect bool - If true, expose the IndexEndpoint via private service connect.
- project_allowlist Sequence[str] - A list of Projects from which the forwarding rule will target the service attachment.
- enablePrivateServiceConnect Boolean - If true, expose the IndexEndpoint via private service connect.
- projectAllowlist List<String> - A list of Projects from which the forwarding rule will target the service attachment.
Package Details
- Repository: Google Cloud Native (pulumi/pulumi-google-native)
- License: Apache-2.0