"No such file or directory: '/opt/ml/model/model.joblib'" occurred at deploy in local mode on Windows #846

@xnaiman

Description

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Scikit-Learn
  • Framework Version: 0.20.0 (official sagemaker-scikit-learn-container)
  • Python Version: 3.6
  • CPU or GPU: CPU
  • Python SDK Version: 1.26.0
  • Are you using a custom image: No

Describe the problem

No such file or directory: '/opt/ml/model/model.joblib' is raised at deploy time in local mode on Windows.
(Raised after working around issues #844 and #845.)
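
The path in the error suggests that the serving code looks for model.joblib directly under the model directory (/opt/ml/model inside the container). A minimal sketch of a check that could help narrow this down, assuming the training script is expected to write a file with that name into the model directory. The helper name check_model_artifact is hypothetical and not part of the SageMaker SDK; it only inspects a directory on disk.

```python
import os
import tempfile


def check_model_artifact(model_dir, filename="model.joblib"):
    """Return (found, listing) for the expected model file in model_dir.

    Hypothetical diagnostic helper, not part of the SageMaker SDK.
    `filename` defaults to the name seen in the error message.
    """
    if not os.path.isdir(model_dir):
        # The directory itself is missing (for example, a mount that failed).
        return False, []
    listing = sorted(os.listdir(model_dir))
    return filename in listing, listing


if __name__ == "__main__":
    # An empty directory stands in for a model dir with no saved artifact.
    with tempfile.TemporaryDirectory() as model_dir:
        print(check_model_artifact(model_dir))

        # After the artifact is written, the check reports it as present.
        open(os.path.join(model_dir, "model.joblib"), "w").close()
        print(check_model_artifact(model_dir))
```

Calling this at the end of the training script (on the model directory) and at the start of the model loading function would show on which side the file goes missing. This is only a diagnostic under the stated assumption, not a fix.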

Minimal repro / logs

Training

  • Logs
Windows Support for Local Mode is Experimental
Creating tmpri2jv76o_algo-1-4cuqr_1 ... done
Attaching to tmpri2jv76o_algo-1-4cuqr_1
algo-1-4cuqr_1  | 2019-06-12 23:57:11,801 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
algo-1-4cuqr_1  | 2019-06-12 23:57:11,811 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
algo-1-4cuqr_1  | 2019-06-12 23:57:11,813 botocore.hooks DEBUG    Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
algo-1-4cuqr_1  | 2019-06-12 23:57:11,820 botocore.hooks DEBUG    Changing event name from before-call.apigateway to before-call.api-gateway
algo-1-4cuqr_1  | 2019-06-12 23:57:11,821 botocore.hooks DEBUG    Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
algo-1-4cuqr_1  | 2019-06-12 23:57:11,825 botocore.hooks DEBUG    Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
algo-1-4cuqr_1  | 2019-06-12 23:57:11,826 botocore.hooks DEBUG    Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
algo-1-4cuqr_1  | 2019-06-12 23:57:11,826 botocore.hooks DEBUG    Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
algo-1-4cuqr_1  | 2019-06-12 23:57:11,828 botocore.hooks DEBUG    Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:11,832 botocore.hooks DEBUG    Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
algo-1-4cuqr_1  | 2019-06-12 23:57:11,833 botocore.hooks DEBUG    Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:11,835 botocore.hooks DEBUG    Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
algo-1-4cuqr_1  | 2019-06-12 23:57:11,835 botocore.hooks DEBUG    Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:11,839 sagemaker_sklearn_container.training INFO     Invoking user training script.
algo-1-4cuqr_1  | 2019-06-12 23:57:11,842 botocore.hooks DEBUG    Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
algo-1-4cuqr_1  | 2019-06-12 23:57:11,845 botocore.hooks DEBUG    Changing event name from before-call.apigateway to before-call.api-gateway
algo-1-4cuqr_1  | 2019-06-12 23:57:11,845 botocore.hooks DEBUG    Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
algo-1-4cuqr_1  | 2019-06-12 23:57:11,848 botocore.hooks DEBUG    Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
algo-1-4cuqr_1  | 2019-06-12 23:57:11,848 botocore.hooks DEBUG    Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
algo-1-4cuqr_1  | 2019-06-12 23:57:11,849 botocore.hooks DEBUG    Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
algo-1-4cuqr_1  | 2019-06-12 23:57:11,851 botocore.hooks DEBUG    Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:11,854 botocore.hooks DEBUG    Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
algo-1-4cuqr_1  | 2019-06-12 23:57:11,855 botocore.hooks DEBUG    Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:11,855 botocore.hooks DEBUG    Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
algo-1-4cuqr_1  | 2019-06-12 23:57:11,857 botocore.hooks DEBUG    Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:11,863 botocore.loaders DEBUG    Loading JSON file: /usr/local/lib/python3.5/dist-packages/boto3/data/s3/2006-03-01/resources-1.json
algo-1-4cuqr_1  | 2019-06-12 23:57:11,866 botocore.credentials DEBUG    Looking for credentials via: env
algo-1-4cuqr_1  | 2019-06-12 23:57:11,866 botocore.credentials INFO     Found credentials in environment variables.
algo-1-4cuqr_1  | 2019-06-12 23:57:11,866 botocore.loaders DEBUG    Loading JSON file: /usr/local/lib/python3.5/dist-packages/botocore/data/endpoints.json
algo-1-4cuqr_1  | 2019-06-12 23:57:11,871 botocore.hooks DEBUG    Event choose-service-name: calling handler <function handle_service_name_alias at 0x7ff9c9cafa60>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,875 botocore.loaders DEBUG    Loading JSON file: /usr/local/lib/python3.5/dist-packages/botocore/data/s3/2006-03-01/service-2.json
algo-1-4cuqr_1  | 2019-06-12 23:57:11,881 botocore.hooks DEBUG    Event creating-client-class.s3: calling handler <function add_generate_presigned_post at 0x7ff9c7814400>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,881 botocore.hooks DEBUG    Event creating-client-class.s3: calling handler <function lazy_call.<locals>._handler at 0x7ff9c6d08840>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,882 botocore.hooks DEBUG    Event creating-client-class.s3: calling handler <function add_generate_presigned_url at 0x7ff9c78141e0>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,883 botocore.args DEBUG    The s3 config key is not a dictionary type, ignoring its value of: None
algo-1-4cuqr_1  | 2019-06-12 23:57:11,885 botocore.endpoint DEBUG    Setting s3 timeout as (60, 60)
algo-1-4cuqr_1  | 2019-06-12 23:57:11,886 botocore.loaders DEBUG    Loading JSON file: /usr/local/lib/python3.5/dist-packages/botocore/data/_retry.json
algo-1-4cuqr_1  | 2019-06-12 23:57:11,887 botocore.client DEBUG    Registering retry handlers for service: s3
algo-1-4cuqr_1  | 2019-06-12 23:57:11,894 botocore.client DEBUG    Defaulting to S3 virtual host style addressing with path style addressing fallback.
algo-1-4cuqr_1  | 2019-06-12 23:57:11,895 boto3.resources.factory DEBUG    Loading s3:s3
algo-1-4cuqr_1  | 2019-06-12 23:57:11,896 boto3.resources.factory DEBUG    Loading s3:Bucket
algo-1-4cuqr_1  | 2019-06-12 23:57:11,897 boto3.resources.model DEBUG    Renaming Bucket attribute name
algo-1-4cuqr_1  | 2019-06-12 23:57:11,898 botocore.hooks DEBUG    Event creating-resource-class.s3.Bucket: calling handler <function lazy_call.<locals>._handler at 0x7ff9c6d08950>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,899 s3transfer.utils DEBUG    Acquiring 0
algo-1-4cuqr_1  | 2019-06-12 23:57:11,899 s3transfer.tasks DEBUG    DownloadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7ff9c6ae4278>}) about to wait for the following futures []
algo-1-4cuqr_1  | 2019-06-12 23:57:11,900 s3transfer.tasks DEBUG    DownloadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7ff9c6ae4278>}) done waiting for dependent futures
algo-1-4cuqr_1  | 2019-06-12 23:57:11,900 s3transfer.tasks DEBUG    Executing task DownloadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7ff9c6ae4278>}) with kwargs {'config': <boto3.s3.transfer.TransferConfig object at 0x7ff9c6b5c710>, 'io_executor': <s3transfer.futures.BoundedExecutor object at 0x7ff9c6b5cef0>, 'osutil': <s3transfer.utils.OSUtils object at 0x7ff9c6b5c8d0>, 'request_executor': <s3transfer.futures.BoundedExecutor object at 0x7ff9c6b5cac8>, 'client': <botocore.client.S3 object at 0x7ff9c6b2b048>, 'transfer_future': <s3transfer.futures.TransferFuture object at 0x7ff9c6ae4278>}
algo-1-4cuqr_1  | 2019-06-12 23:57:11,900 botocore.hooks DEBUG    Event before-parameter-build.s3.HeadObject: calling handler <function sse_md5 at 0x7ff9c9ccb730>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,900 botocore.hooks DEBUG    Event before-parameter-build.s3.HeadObject: calling handler <function validate_bucket_name at 0x7ff9c9ccb6a8>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,901 botocore.hooks DEBUG    Event before-parameter-build.s3.HeadObject: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x7ff9c6b2b5f8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,901 botocore.hooks DEBUG    Event before-parameter-build.s3.HeadObject: calling handler <function generate_idempotent_uuid at 0x7ff9c9ccb2f0>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,901 botocore.hooks DEBUG    Event before-call.s3.HeadObject: calling handler <function add_expect_header at 0x7ff9c9ccbb70>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,901 botocore.hooks DEBUG    Event before-call.s3.HeadObject: calling handler <bound method S3RegionRedirector.set_request_url of <botocore.utils.S3RegionRedirector object at 0x7ff9c6b2b5f8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,901 botocore.endpoint DEBUG    Making request for OperationModel(name=HeadObject) with params: {'url': 'https://s3.ap-northeast-1.amazonaws.com/sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'context': {'client_config': <botocore.config.Config object at 0x7ff9c6b2b518>, 'has_streaming_input': False, 'client_region': 'ap-northeast-1', 'auth_type': None, 'signing': {'bucket': 'sagemaker-ap-northeast-1-130747742019'}}, 'body': b'', 'method': 'HEAD', 'query_string': {}, 'url_path': '/sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'headers': {'User-Agent': 'Boto3/1.9.51 Python/3.5.2 Linux/4.9.125-linuxkit Botocore/1.12.51 Resource'}}
algo-1-4cuqr_1  | 2019-06-12 23:57:11,902 botocore.hooks DEBUG    Event request-created.s3.HeadObject: calling handler <function signal_not_transferring at 0x7ff9c7659730>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,902 botocore.hooks DEBUG    Event request-created.s3.HeadObject: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7ff9c6b2b4a8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,902 botocore.hooks DEBUG    Event choose-signer.s3.HeadObject: calling handler <bound method ClientCreator._default_s3_presign_to_sigv2 of <botocore.client.ClientCreator object at 0x7ff9c6c67390>>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,902 botocore.hooks DEBUG    Event choose-signer.s3.HeadObject: calling handler <function set_operation_specific_signer at 0x7ff9c9ccb1e0>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,903 botocore.hooks DEBUG    Event before-sign.s3.HeadObject: calling handler <function fix_s3_host at 0x7ff9c7914730>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,903 botocore.utils DEBUG    Checking for DNS compatible bucket for: https://s3.ap-northeast-1.amazonaws.com/sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  | 2019-06-12 23:57:11,904 botocore.utils DEBUG    URI updated to: https://sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  | 2019-06-12 23:57:11,904 botocore.auth DEBUG    Calculating signature using v4 auth.
algo-1-4cuqr_1  | 2019-06-12 23:57:11,904 botocore.auth DEBUG    CanonicalRequest:
algo-1-4cuqr_1  | HEAD
algo-1-4cuqr_1  | /sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | host:sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com
algo-1-4cuqr_1  | x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
algo-1-4cuqr_1  | x-amz-date:20190612T235711Z
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | host;x-amz-content-sha256;x-amz-date
algo-1-4cuqr_1  | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
algo-1-4cuqr_1  | 2019-06-12 23:57:11,904 botocore.auth DEBUG    StringToSign:
algo-1-4cuqr_1  | AWS4-HMAC-SHA256
algo-1-4cuqr_1  | 20190612T235711Z
algo-1-4cuqr_1  | 20190612/ap-northeast-1/s3/aws4_request
algo-1-4cuqr_1  | 3c7d2e923cfbc1df5b252ea8422724d8ecd6c607014af9b27717fcbdf25c94d1
algo-1-4cuqr_1  | 2019-06-12 23:57:11,905 botocore.auth DEBUG    Signature:
algo-1-4cuqr_1  | 43919605f5db884d3d34ed4b5732ace0779cf7c138fbf01b1f1e2b61151ed66f
algo-1-4cuqr_1  | 2019-06-12 23:57:11,905 botocore.hooks DEBUG    Event request-created.s3.HeadObject: calling handler <function signal_transferring at 0x7ff9c76597b8>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,905 botocore.endpoint DEBUG    Sending http request: <AWSPreparedRequest stream_output=False, method=HEAD, url=https://sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz, headers={'X-Amz-Date': b'20190612T235711Z', 'User-Agent': b'Boto3/1.9.51 Python/3.5.2 Linux/4.9.125-linuxkit Botocore/1.12.51 Resource', 'X-Amz-Content-SHA256': b'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': b'AWS4-HMAC-SHA256 Credential=AKIAR44JMG5BUD7WQML6/20190612/ap-northeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=43919605f5db884d3d34ed4b5732ace0779cf7c138fbf01b1f1e2b61151ed66f'}>
algo-1-4cuqr_1  | 2019-06-12 23:57:11,906 urllib3.util.retry DEBUG    Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0, status=None)
algo-1-4cuqr_1  | 2019-06-12 23:57:11,906 urllib3.connectionpool DEBUG    Starting new HTTPS connection (1): sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com:443
algo-1-4cuqr_1  | 2019-06-12 23:57:12,024 urllib3.connectionpool DEBUG    https://sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com:443 "HEAD /sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz HTTP/1.1" 200 0
algo-1-4cuqr_1  | 2019-06-12 23:57:12,025 botocore.parsers DEBUG    Response headers: {'ETag': '"2a30f306729c3090cfaa4822ab8854f0"', 'x-amz-request-id': 'F48E7D6B6A526E82', 'Server': 'AmazonS3', 'Accept-Ranges': 'bytes', 'Content-Type': 'binary/octet-stream', 'x-amz-id-2': 'cAB/AbGlPTnTc025AlpNhnwIFzC34NmUPjtLJAUp7Tf50k4QwAxC33+RE0jTSrnG7qafBtR38Fs=', 'Date': 'Wed, 12 Jun 2019 23:57:25 GMT', 'Content-Length': '7271', 'Last-Modified': 'Wed, 12 Jun 2019 23:57:23 GMT'}
algo-1-4cuqr_1  | 2019-06-12 23:57:12,025 botocore.parsers DEBUG    Response body:
algo-1-4cuqr_1  | b''
algo-1-4cuqr_1  | 2019-06-12 23:57:12,026 botocore.hooks DEBUG    Event needs-retry.s3.HeadObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7ff9c6b2bf60>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,026 botocore.retryhandler DEBUG    No retry needed.
algo-1-4cuqr_1  | 2019-06-12 23:57:12,026 botocore.hooks DEBUG    Event needs-retry.s3.HeadObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7ff9c6b2b5f8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,027 s3transfer.futures DEBUG    Submitting task ImmediatelyWriteIOGetObjectTask(transfer_id=0, {'bucket': 'sagemaker-ap-northeast-1-130747742019', 'key': 'sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'extra_args': {}}) to executor <s3transfer.futures.BoundedExecutor object at 0x7ff9c6b5cac8> for transfer request: 0.
algo-1-4cuqr_1  | 2019-06-12 23:57:12,027 s3transfer.utils DEBUG    Acquiring 0
algo-1-4cuqr_1  | 2019-06-12 23:57:12,028 s3transfer.tasks DEBUG    ImmediatelyWriteIOGetObjectTask(transfer_id=0, {'bucket': 'sagemaker-ap-northeast-1-130747742019', 'key': 'sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'extra_args': {}}) about to wait for the following futures []
algo-1-4cuqr_1  | 2019-06-12 23:57:12,028 s3transfer.tasks DEBUG    ImmediatelyWriteIOGetObjectTask(transfer_id=0, {'bucket': 'sagemaker-ap-northeast-1-130747742019', 'key': 'sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'extra_args': {}}) done waiting for dependent futures
algo-1-4cuqr_1  | 2019-06-12 23:57:12,028 s3transfer.tasks DEBUG    Executing task ImmediatelyWriteIOGetObjectTask(transfer_id=0, {'bucket': 'sagemaker-ap-northeast-1-130747742019', 'key': 'sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'extra_args': {}}) with kwargs {'max_attempts': 5, 'bucket': 'sagemaker-ap-northeast-1-130747742019', 'bandwidth_limiter': None, 'io_chunksize': 262144, 'fileobj': <s3transfer.utils.DeferredOpenFile object at 0x7ff9c6b12cf8>, 'callbacks': [], 'key': 'sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'client': <botocore.client.S3 object at 0x7ff9c6b2b048>, 'download_output_manager': <s3transfer.download.DownloadFilenameOutputManager object at 0x7ff9c6b12cc0>, 'extra_args': {}}
algo-1-4cuqr_1  | 2019-06-12 23:57:12,028 botocore.hooks DEBUG    Event before-parameter-build.s3.GetObject: calling handler <function sse_md5 at 0x7ff9c9ccb730>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,028 s3transfer.utils DEBUG    Releasing acquire 0/None
algo-1-4cuqr_1  | 2019-06-12 23:57:12,029 botocore.hooks DEBUG    Event before-parameter-build.s3.GetObject: calling handler <function validate_bucket_name at 0x7ff9c9ccb6a8>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,029 botocore.hooks DEBUG    Event before-parameter-build.s3.GetObject: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x7ff9c6b2b5f8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,029 botocore.hooks DEBUG    Event before-parameter-build.s3.GetObject: calling handler <function generate_idempotent_uuid at 0x7ff9c9ccb2f0>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,029 botocore.hooks DEBUG    Event before-call.s3.GetObject: calling handler <function add_expect_header at 0x7ff9c9ccbb70>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,029 botocore.hooks DEBUG    Event before-call.s3.GetObject: calling handler <bound method S3RegionRedirector.set_request_url of <botocore.utils.S3RegionRedirector object at 0x7ff9c6b2b5f8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,029 botocore.endpoint DEBUG    Making request for OperationModel(name=GetObject) with params: {'url': 'https://s3.ap-northeast-1.amazonaws.com/sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'context': {'client_config': <botocore.config.Config object at 0x7ff9c6b2b518>, 'has_streaming_input': False, 'client_region': 'ap-northeast-1', 'auth_type': None, 'signing': {'bucket': 'sagemaker-ap-northeast-1-130747742019'}}, 'body': b'', 'method': 'GET', 'query_string': {}, 'url_path': '/sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz', 'headers': {'User-Agent': 'Boto3/1.9.51 Python/3.5.2 Linux/4.9.125-linuxkit Botocore/1.12.51 Resource'}}
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.hooks DEBUG    Event request-created.s3.GetObject: calling handler <function signal_not_transferring at 0x7ff9c7659730>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.hooks DEBUG    Event request-created.s3.GetObject: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7ff9c6b2b4a8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.hooks DEBUG    Event choose-signer.s3.GetObject: calling handler <bound method ClientCreator._default_s3_presign_to_sigv2 of <botocore.client.ClientCreator object at 0x7ff9c6c67390>>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.hooks DEBUG    Event choose-signer.s3.GetObject: calling handler <function set_operation_specific_signer at 0x7ff9c9ccb1e0>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.hooks DEBUG    Event before-sign.s3.GetObject: calling handler <function fix_s3_host at 0x7ff9c7914730>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.utils DEBUG    Checking for DNS compatible bucket for: https://s3.ap-northeast-1.amazonaws.com/sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.utils DEBUG    URI updated to: https://sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.auth DEBUG    Calculating signature using v4 auth.
algo-1-4cuqr_1  | 2019-06-12 23:57:12,030 botocore.auth DEBUG    CanonicalRequest:
algo-1-4cuqr_1  | GET
algo-1-4cuqr_1  | /sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | host:sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com
algo-1-4cuqr_1  | x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
algo-1-4cuqr_1  | x-amz-date:20190612T235712Z
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | host;x-amz-content-sha256;x-amz-date
algo-1-4cuqr_1  | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
algo-1-4cuqr_1  | 2019-06-12 23:57:12,031 botocore.auth DEBUG    StringToSign:
algo-1-4cuqr_1  | AWS4-HMAC-SHA256
algo-1-4cuqr_1  | 20190612T235712Z
algo-1-4cuqr_1  | 20190612/ap-northeast-1/s3/aws4_request
algo-1-4cuqr_1  | d5310e2e1112cc81c0cbcd2fdb9f2f8fe645ea026d34a43d2a5cd9ce8cf1b5d3
algo-1-4cuqr_1  | 2019-06-12 23:57:12,031 botocore.auth DEBUG    Signature:
algo-1-4cuqr_1  | 77bc5f5a49de3134943f04f965e28fb747c4caed3e0f466dca439447f71291e4
algo-1-4cuqr_1  | 2019-06-12 23:57:12,031 botocore.hooks DEBUG    Event request-created.s3.GetObject: calling handler <function signal_transferring at 0x7ff9c76597b8>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,031 botocore.endpoint DEBUG    Sending http request: <AWSPreparedRequest stream_output=True, method=GET, url=https://sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz, headers={'X-Amz-Date': b'20190612T235712Z', 'User-Agent': b'Boto3/1.9.51 Python/3.5.2 Linux/4.9.125-linuxkit Botocore/1.12.51 Resource', 'X-Amz-Content-SHA256': b'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': b'AWS4-HMAC-SHA256 Credential=AKIAR44JMG5BUD7WQML6/20190612/ap-northeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=77bc5f5a49de3134943f04f965e28fb747c4caed3e0f466dca439447f71291e4'}>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,031 urllib3.util.retry DEBUG    Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0, status=None)
algo-1-4cuqr_1  | 2019-06-12 23:57:12,066 urllib3.connectionpool DEBUG    https://sagemaker-ap-northeast-1-130747742019.s3.ap-northeast-1.amazonaws.com:443 "GET /sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz HTTP/1.1" 200 7271
algo-1-4cuqr_1  | 2019-06-12 23:57:12,066 botocore.parsers DEBUG    Response headers: {'ETag': '"2a30f306729c3090cfaa4822ab8854f0"', 'x-amz-request-id': '7250ADF3F5A5A49C', 'Server': 'AmazonS3', 'Accept-Ranges': 'bytes', 'Content-Type': 'binary/octet-stream', 'x-amz-id-2': 'WiYMANCqDMYoqe7ZcCVyUJTpmRTWAQ9GRSIeW6JAlxtmugzNooQgP8bb/pnTQ+PBo5seJvLVhoc=', 'Date': 'Wed, 12 Jun 2019 23:57:25 GMT', 'Content-Length': '7271', 'Last-Modified': 'Wed, 12 Jun 2019 23:57:23 GMT'}
algo-1-4cuqr_1  | 2019-06-12 23:57:12,067 botocore.parsers DEBUG    Response body:
algo-1-4cuqr_1  | <botocore.response.StreamingBody object at 0x7ff9c6aba048>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,067 botocore.hooks DEBUG    Event needs-retry.s3.GetObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7ff9c6b2bf60>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,068 botocore.retryhandler DEBUG    No retry needed.
algo-1-4cuqr_1  | 2019-06-12 23:57:12,068 botocore.hooks DEBUG    Event needs-retry.s3.GetObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7ff9c6b2b5f8>>
algo-1-4cuqr_1  | 2019-06-12 23:57:12,076 s3transfer.tasks DEBUG    IOWriteTask(transfer_id=0, {'offset': 0}) about to wait for the following futures []
algo-1-4cuqr_1  | 2019-06-12 23:57:12,076 s3transfer.tasks DEBUG    IOWriteTask(transfer_id=0, {'offset': 0}) done waiting for dependent futures
algo-1-4cuqr_1  | 2019-06-12 23:57:12,076 s3transfer.tasks DEBUG    Executing task IOWriteTask(transfer_id=0, {'offset': 0}) with kwargs {'offset': 0, 'fileobj': <s3transfer.utils.DeferredOpenFile object at 0x7ff9c6b12cf8>}
algo-1-4cuqr_1  | 2019-06-12 23:57:12,077 s3transfer.tasks DEBUG    IORenameFileTask(transfer_id=0, {'final_filename': '/tmp/tmp_f99p1cx/tar_file'}) about to wait for the following futures []
algo-1-4cuqr_1  | 2019-06-12 23:57:12,077 s3transfer.tasks DEBUG    IORenameFileTask(transfer_id=0, {'final_filename': '/tmp/tmp_f99p1cx/tar_file'}) done waiting for dependent futures
algo-1-4cuqr_1  | 2019-06-12 23:57:12,077 s3transfer.tasks DEBUG    Executing task IORenameFileTask(transfer_id=0, {'final_filename': '/tmp/tmp_f99p1cx/tar_file'}) with kwargs {'fileobj': <s3transfer.utils.DeferredOpenFile object at 0x7ff9c6b12cf8>, 'osutil': <s3transfer.utils.OSUtils object at 0x7ff9c6b5c8d0>, 'final_filename': '/tmp/tmp_f99p1cx/tar_file'}
algo-1-4cuqr_1  | 2019-06-12 23:57:12,077 s3transfer.utils DEBUG    Releasing acquire 0/None
algo-1-4cuqr_1  | 2019-06-12 23:57:12,084 sagemaker-containers INFO     Module train_model does not provide a setup.py.
algo-1-4cuqr_1  | Generating setup.py
algo-1-4cuqr_1  | 2019-06-12 23:57:12,084 sagemaker-containers INFO     Generating setup.cfg
algo-1-4cuqr_1  | 2019-06-12 23:57:12,084 sagemaker-containers INFO     Generating MANIFEST.in
algo-1-4cuqr_1  | 2019-06-12 23:57:12,084 sagemaker-containers INFO     Installing module with the following command:
algo-1-4cuqr_1  | /usr/bin/python3 -m pip install -U .
algo-1-4cuqr_1  | Processing /opt/ml/code
algo-1-4cuqr_1  | Building wheels for collected packages: train-model
algo-1-4cuqr_1  |   Running setup.py bdist_wheel for train-model ... done
algo-1-4cuqr_1  |   Stored in directory: /tmp/pip-ephem-wheel-cache-d58g4m1m/wheels/35/24/16/37574d11bf9bde50616c67372a334f94fa8356bc7164af8ca3
algo-1-4cuqr_1  | Successfully built train-model
algo-1-4cuqr_1  | Installing collected packages: train-model
algo-1-4cuqr_1  | Successfully installed train-model-1.0.0
algo-1-4cuqr_1  | You are using pip version 18.1, however version 19.1.1 is available.
algo-1-4cuqr_1  | You should consider upgrading via the 'pip install --upgrade pip' command.
algo-1-4cuqr_1  | 2019-06-12 23:57:12,983 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
algo-1-4cuqr_1  | 2019-06-12 23:57:12,984 botocore.hooks DEBUG    Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
algo-1-4cuqr_1  | 2019-06-12 23:57:12,987 botocore.hooks DEBUG    Changing event name from before-call.apigateway to before-call.api-gateway
algo-1-4cuqr_1  | 2019-06-12 23:57:12,988 botocore.hooks DEBUG    Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
algo-1-4cuqr_1  | 2019-06-12 23:57:12,990 botocore.hooks DEBUG    Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
algo-1-4cuqr_1  | 2019-06-12 23:57:12,990 botocore.hooks DEBUG    Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
algo-1-4cuqr_1  | 2019-06-12 23:57:12,990 botocore.hooks DEBUG    Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
algo-1-4cuqr_1  | 2019-06-12 23:57:12,991 botocore.hooks DEBUG    Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:12,995 botocore.hooks DEBUG    Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
algo-1-4cuqr_1  | 2019-06-12 23:57:12,995 botocore.hooks DEBUG    Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:12,995 botocore.hooks DEBUG    Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
algo-1-4cuqr_1  | 2019-06-12 23:57:12,996 botocore.hooks DEBUG    Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
algo-1-4cuqr_1  | 2019-06-12 23:57:12,998 sagemaker-containers INFO     Invoking user script
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | Training Env:
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | {
algo-1-4cuqr_1  |     "hosts": [
algo-1-4cuqr_1  |         "algo-1-4cuqr"
algo-1-4cuqr_1  |     ],
algo-1-4cuqr_1  |     "output_dir": "/opt/ml/output",
algo-1-4cuqr_1  |     "user_entry_point": "train_model.py",
algo-1-4cuqr_1  |     "current_host": "algo-1-4cuqr",
algo-1-4cuqr_1  |     "input_data_config": {
algo-1-4cuqr_1  |         "train": {
algo-1-4cuqr_1  |             "TrainingInputMode": "File"
algo-1-4cuqr_1  |         }
algo-1-4cuqr_1  |     },
algo-1-4cuqr_1  |     "output_intermediate_dir": "/opt/ml/output/intermediate",
algo-1-4cuqr_1  |     "input_dir": "/opt/ml/input",
algo-1-4cuqr_1  |     "output_data_dir": "/opt/ml/output/data",
algo-1-4cuqr_1  |     "num_gpus": 0,
algo-1-4cuqr_1  |     "log_level": 10,
algo-1-4cuqr_1  |     "resource_config": {
algo-1-4cuqr_1  |         "hosts": [
algo-1-4cuqr_1  |             "algo-1-4cuqr"
algo-1-4cuqr_1  |         ],
algo-1-4cuqr_1  |         "current_host": "algo-1-4cuqr"
algo-1-4cuqr_1  |     },
algo-1-4cuqr_1  |     "job_name": "sagemaker-scikit-learn-2019-06-12-23-57-07-460",
algo-1-4cuqr_1  |     "additional_framework_parameters": {},
algo-1-4cuqr_1  |     "input_config_dir": "/opt/ml/input/config",
algo-1-4cuqr_1  |     "network_interface_name": "ethwe",
algo-1-4cuqr_1  |     "module_dir": "s3://sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz",
algo-1-4cuqr_1  |     "framework_module": "sagemaker_sklearn_container.training:main",
algo-1-4cuqr_1  |     "channel_input_dirs": {
algo-1-4cuqr_1  |         "train": "/opt/ml/input/data/train"
algo-1-4cuqr_1  |     },
algo-1-4cuqr_1  |     "module_name": "train_model",
algo-1-4cuqr_1  |     "model_dir": "/opt/ml/model",
algo-1-4cuqr_1  |     "hyperparameters": {
algo-1-4cuqr_1  |         "para_c": 1.0,
algo-1-4cuqr_1  |         "para_gamma": 0.01
algo-1-4cuqr_1  |     },
algo-1-4cuqr_1  |     "num_cpus": 4
algo-1-4cuqr_1  | }
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | Environment variables:
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | PYTHONPATH=/usr/local/bin:/usr/lib/python35.zip:/usr/lib/python3.5:/usr/lib/python3.5/plat-x86_64-linux-gnu:/usr/lib/python3.5/lib-dynload:/usr/local/lib/python3.5/dist-packages:/usr/lib/python3/dist-packages
algo-1-4cuqr_1  | SM_MODULE_NAME=train_model
algo-1-4cuqr_1  | SM_LOG_LEVEL=10
algo-1-4cuqr_1  | SM_CHANNELS=["train"]
algo-1-4cuqr_1  | SM_HP_PARA_C=1.0
algo-1-4cuqr_1  | SM_CURRENT_HOST=algo-1-4cuqr
algo-1-4cuqr_1  | SM_FRAMEWORK_PARAMS={}
algo-1-4cuqr_1  | SM_INPUT_CONFIG_DIR=/opt/ml/input/config
algo-1-4cuqr_1  | SM_OUTPUT_DIR=/opt/ml/output
algo-1-4cuqr_1  | SM_HP_PARA_GAMMA=0.01
algo-1-4cuqr_1  | SM_HPS={"para_c":1.0,"para_gamma":0.01}
algo-1-4cuqr_1  | SM_FRAMEWORK_MODULE=sagemaker_sklearn_container.training:main
algo-1-4cuqr_1  | SM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate
algo-1-4cuqr_1  | SM_USER_ARGS=["--para_c","1.0","--para_gamma","0.01"]
algo-1-4cuqr_1  | SM_NUM_CPUS=4
algo-1-4cuqr_1  | SM_OUTPUT_DATA_DIR=/opt/ml/output/data
algo-1-4cuqr_1  | SM_MODULE_DIR=s3://sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz
algo-1-4cuqr_1  | SM_RESOURCE_CONFIG={"current_host":"algo-1-4cuqr","hosts":["algo-1-4cuqr"]}
algo-1-4cuqr_1  | SM_HOSTS=["algo-1-4cuqr"]
algo-1-4cuqr_1  | SM_INPUT_DIR=/opt/ml/input
algo-1-4cuqr_1  | SM_CHANNEL_TRAIN=/opt/ml/input/data/train
algo-1-4cuqr_1  | SM_NUM_GPUS=0
algo-1-4cuqr_1  | SM_USER_ENTRY_POINT=train_model.py
algo-1-4cuqr_1  | SM_TRAINING_ENV={"additional_framework_parameters":{},"channel_input_dirs":{"train":"/opt/ml/input/data/train"},"current_host":"algo-1-4cuqr","framework_module":"sagemaker_sklearn_container.training:main","hosts":["algo-1-4cuqr"],"hyperparameters":{"para_c":1.0,"para_gamma":0.01},"input_config_dir":"/opt/ml/input/config","input_data_config":{"train":{"TrainingInputMode":"File"}},"input_dir":"/opt/ml/input","job_name":"sagemaker-scikit-learn-2019-06-12-23-57-07-460","log_level":10,"model_dir":"/opt/ml/model","module_dir":"s3://sagemaker-ap-northeast-1-130747742019/sagemaker-scikit-learn-2019-06-12-23-57-07-460/source/sourcedir.tar.gz","module_name":"train_model","network_interface_name":"ethwe","num_cpus":4,"num_gpus":0,"output_data_dir":"/opt/ml/output/data","output_dir":"/opt/ml/output","output_intermediate_dir":"/opt/ml/output/intermediate","resource_config":{"current_host":"algo-1-4cuqr","hosts":["algo-1-4cuqr"]},"user_entry_point":"train_model.py"}
algo-1-4cuqr_1  | SM_INPUT_DATA_CONFIG={"train":{"TrainingInputMode":"File"}}
algo-1-4cuqr_1  | SM_NETWORK_INTERFACE_NAME=ethwe
algo-1-4cuqr_1  | SM_MODEL_DIR=/opt/ml/model
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | Invoking script with the following command:
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | /usr/bin/python3 -m train_model --para_c 1.0 --para_gamma 0.01
algo-1-4cuqr_1  |
algo-1-4cuqr_1  |
algo-1-4cuqr_1  | 2019-06-12 23:57:13,528 sagemaker-containers INFO     Reporting training SUCCESS
tmpri2jv76o_algo-1-4cuqr_1 exited with code 0
Aborting on container exit...
===== Job Complete =====
  • Created Temp Files
Temp
├─tmpzg964wnz
│      iris_train.csv
│
└─tmpri2jv76o
    │  docker-compose.yaml
    │  
    ├─artifacts
    ├─model
    │      model.joblib
    │      
    └─output
      └─data
  • docker-compose.yaml
networks:
  sagemaker-local:
    name: sagemaker-local
services:
  algo-1-4cuqr:
    command: train
    environment:
    - AWS_ACCESS_KEY_ID=XXXXXXXXXXXXX
    - AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXX
    - AWS_REGION=ap-northeast-1
    - TRAINING_JOB_NAME=sagemaker-scikit-learn-2019-06-12-23-57-07-460
    - HTTP_PROXY=http://username:password@foobar:port
    - HTTPS_PROXY=http://username:password@foobar:port
    image: 354813040037.dkr.ecr.ap-northeast-1.amazonaws.com/sagemaker-scikit-learn:0.20.0-cpu-py3
    networks:
      sagemaker-local:
        aliases:
        - algo-1-4cuqr
    stdin_open: true
    tty: true
    volumes:
    - C:\Users\username\AppData\Local\Temp\tmpri2jv76o\algo-1-4cuqr\output:/opt/ml/output
    - C:\Users\username\AppData\Local\Temp\tmpri2jv76o\algo-1-4cuqr\input:/opt/ml/input
    - C:\Users\username\AppData\Local\Temp\tmpri2jv76o\algo-1-4cuqr\output/data:/opt/ml/output/data
    - C:\Users\username\AppData\Local\Temp\tmpri2jv76o\model:/opt/ml/model
    - C:\Users\username\AppData\Local\Temp\tmpzg964wnz:/opt/ml/input/data/train
version: '2.3'

Deploying

  • Logs
Attaching to tmpdbo8bbod_algo-1-qlg3c_1
algo-1-qlg3c_1  | [2019-06-12 23:57:20 +0000] [16] [INFO] Starting gunicorn 19.9.0
algo-1-qlg3c_1  | [2019-06-12 23:57:20 +0000] [16] [INFO] Listening at: unix:/tmp/gunicorn.sock (16)
algo-1-qlg3c_1  | [2019-06-12 23:57:20 +0000] [16] [INFO] Using worker: gevent
algo-1-qlg3c_1  | [2019-06-12 23:57:20 +0000] [23] [INFO] Booting worker with pid: 23
algo-1-qlg3c_1  | [2019-06-12 23:57:21 +0000] [27] [INFO] Booting worker with pid: 27
algo-1-qlg3c_1  | [2019-06-12 23:57:21 +0000] [29] [INFO] Booting worker with pid: 29
algo-1-qlg3c_1  | [2019-06-12 23:57:21 +0000] [33] [INFO] Booting worker with pid: 33
algo-1-qlg3c_1  | 2019-06-12 23:57:21,219 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
algo-1-qlg3c_1  | 2019-06-12 23:57:21,465 INFO - sagemaker-containers - Module train_model does not provide a setup.py.
algo-1-qlg3c_1  | Generating setup.py
algo-1-qlg3c_1  | 2019-06-12 23:57:21,465 INFO - sagemaker-containers - Generating setup.cfg
algo-1-qlg3c_1  | 2019-06-12 23:57:21,465 INFO - sagemaker-containers - Generating MANIFEST.in
algo-1-qlg3c_1  | 2019-06-12 23:57:21,465 INFO - sagemaker-containers - Installing module with the following command:
algo-1-qlg3c_1  | /usr/bin/python3 -m pip install -U .
algo-1-qlg3c_1  | Processing /opt/ml/code
algo-1-qlg3c_1  | Building wheels for collected packages: train-model
algo-1-qlg3c_1  |   Running setup.py bdist_wheel for train-model ... done
algo-1-qlg3c_1  |   Stored in directory: /tmp/pip-ephem-wheel-cache-4x4jzvi4/wheels/35/24/16/37574d11bf9bde50616c67372a334f94fa8356bc7164af8ca3
algo-1-qlg3c_1  | Successfully built train-model
algo-1-qlg3c_1  | Installing collected packages: train-model
algo-1-qlg3c_1  | Successfully installed train-model-1.0.0
algo-1-qlg3c_1  | You are using pip version 18.1, however version 19.1.1 is available.
algo-1-qlg3c_1  | You should consider upgrading via the 'pip install --upgrade pip' command.
algo-1-qlg3c_1  | [2019-06-12 23:57:22 +0000] [23] [ERROR] Error handling request /ping
algo-1-qlg3c_1  | Traceback (most recent call last):
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sagemaker_containers/_functions.py", line 84, in wrapper
algo-1-qlg3c_1  |     return fn(*args, **kwargs)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/train_model.py", line 53, in model_fn
algo-1-qlg3c_1  |     model = joblib.load(os.path.join(model_dir, MODEL_NAME))
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 590, in load
algo-1-qlg3c_1  |     with open(filename, 'rb') as f:
algo-1-qlg3c_1  | FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/model/model.joblib'
algo-1-qlg3c_1  |
algo-1-qlg3c_1  | During handling of the above exception, another exception occurred:
algo-1-qlg3c_1  |
algo-1-qlg3c_1  | Traceback (most recent call last):
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base_async.py", line 56, in handle
algo-1-qlg3c_1  |     self.handle_request(listener_name, req, client, addr)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
algo-1-qlg3c_1  |     addr)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base_async.py", line 107, in handle_request
algo-1-qlg3c_1  |     respiter = self.wsgi(environ, resp.start_response)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sagemaker_sklearn_container/serving.py", line 103, in main
algo-1-qlg3c_1  |     user_module_transformer.initialize()
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sagemaker_containers/_transformer.py", line 157, in initialize
algo-1-qlg3c_1  |     self._model = self._model_fn(_env.model_dir)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sagemaker_containers/_functions.py", line 86, in wrapper
algo-1-qlg3c_1  |     six.reraise(error_class, error_class(e), sys.exc_info()[2])
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/six.py", line 692, in reraise
algo-1-qlg3c_1  |     raise value.with_traceback(tb)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sagemaker_containers/_functions.py", line 84, in wrapper
algo-1-qlg3c_1  |     return fn(*args, **kwargs)
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/train_model.py", line 53, in model_fn
algo-1-qlg3c_1  |     model = joblib.load(os.path.join(model_dir, MODEL_NAME))
algo-1-qlg3c_1  |   File "/usr/local/lib/python3.5/dist-packages/sklearn/externals/joblib/numpy_pickle.py", line 590, in load
algo-1-qlg3c_1  |     with open(filename, 'rb') as f:
algo-1-qlg3c_1  | sagemaker_containers._errors.ClientError: [Errno 2] No such file or directory: '/opt/ml/model/model.joblib'
algo-1-qlg3c_1  | 172.18.0.1 - - [12/Jun/2019:23:57:22 +0000] "GET /ping HTTP/1.1" 500 141 "-" "-"
  • Created Temp Files
Temp
├─tmpgfpx4a40       # Why is this empty?
│                                     
└─tmpdbo8bbod
    │  docker-compose.yaml
    │  
    └─algo-1-qlg3c
  • docker-compose.yaml
networks:
  sagemaker-local:
    name: sagemaker-local
services:
  algo-1-qlg3c:
    command: serve
    environment:
    - AWS_ACCESS_KEY_ID=XXXXXXXXXXXXX
    - AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXX
    - HTTP_PROXY=http://username:password@foobar:port
    - HTTPS_PROXY=http://username:password@foobar:port
    - SAGEMAKER_PROGRAM=train_model.py
    - SAGEMAKER_SUBMIT_DIRECTORY=s3://sagemaker-ap-northeast-1-XXXXXXXXXX/sagemaker-scikit-learn-2019-06-12-23-57-07-460/sourcedir.tar.gz
    - SAGEMAKER_ENABLE_CLOUDWATCH_METRICS=false
    - SAGEMAKER_CONTAINER_LOG_LEVEL=10
    - SAGEMAKER_REGION=ap-northeast-1
    image: 354813040037.dkr.ecr.ap-northeast-1.amazonaws.com/sagemaker-scikit-learn:0.20.0-cpu-py3
    networks:
      sagemaker-local:
        aliases:
        - algo-1-qlg3c
    ports:
    - 8080:8080
    stdin_open: true
    tty: true
    volumes:
    - C:\Users\username\AppData\Local\Temp\tmpgfpx4a40:/opt/ml/model
version: '2.3'
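
To confirm which host folder backs /opt/ml/model in the serving container, the volume entries above can be inspected programmatically. This is only a minimal sketch and not part of the SageMaker SDK: `split_volume` and `host_dir_for` are hypothetical helper names, and the code assumes each volume entry has the plain `host:container` form shown in this file, with no mode suffix such as `:ro`.

```python
import os


def split_volume(entry):
    """Split a docker-compose 'host:container' volume entry.

    Splits on the LAST colon so that a Windows drive letter such as C:
    stays in the host part. Assumes there is no ':ro' / ':rw' suffix.
    """
    host, container = entry.rsplit(":", 1)
    return host, container


def host_dir_for(volumes, container_path="/opt/ml/model"):
    """Return the host directory mounted at container_path, or None."""
    for entry in volumes:
        host, container = split_volume(entry)
        if container == container_path:
            return host
    return None


# The single volume entry from the deploy-time docker-compose.yaml above.
volumes = [r"C:\Users\username\AppData\Local\Temp\tmpgfpx4a40:/opt/ml/model"]
model_host_dir = host_dir_for(volumes)
print("host dir:", model_host_dir)

# Only list the folder if it exists on the machine running this sketch.
if model_host_dir and os.path.isdir(model_host_dir):
    print("contents:", os.listdir(model_host_dir))
```

Run on the machine that started the endpoint, an empty `contents` list would match the empty tmpgfpx4a40 folder shown in the tree above.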

Exact command to reproduce

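from sagemaker.sklearn.estimator import SKLearn

# script_path, role, train_input and env are defined elsewhere (not shown here).
sklearn = SKLearn(
    entry_point=script_path,
    train_instance_type='local',  # train in a local Docker container
    source_dir='./',
    role=role,
    py_version='py3',
    container_log_level=10,  # 10 == logging.DEBUG
    hyperparameters={'para_c': 1.0, 'para_gamma': 0.01})
sklearn.fit({'train': train_input})

# Deploy to a local endpoint; the error is raised during this step.
predictor = sklearn.deploy(initial_instance_count=1,
                           instance_type='local',
                           env=env)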

Cause

I know the immediate cause: the tmpgfpx4a40 folder, which the deploy-time docker-compose.yaml mounts at /opt/ml/model, is empty, so model_fn cannot find model.joblib there. Training did write model.joblib to tmpri2jv76o\model, but the file never reached the folder used at deploy time.
However, I don't know why the tmpgfpx4a40 folder is empty or how to avoid this.
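
While the root cause is open, the error message itself can be made more informative. The following is a minimal sketch under stated assumptions, not the container's or the SDK's behaviour: `resolve_model_path` is a hypothetical helper, and the default file name `model.joblib` is taken from the path in the traceback. It uses `str.format` rather than f-strings because the container runs Python 3.5.

```python
import os


def resolve_model_path(model_dir, model_name="model.joblib"):
    """Return the path to the model file, or raise with a listing of model_dir."""
    path = os.path.join(model_dir, model_name)
    if os.path.isfile(path):
        return path
    # Report what is actually in the mounted folder (None if it does not exist).
    found = sorted(os.listdir(model_dir)) if os.path.isdir(model_dir) else None
    raise FileNotFoundError(
        "{} not found; contents of {}: {}".format(path, model_dir, found))
```

Inside `model_fn`, the existing `joblib.load(os.path.join(model_dir, MODEL_NAME))` call could instead load from `resolve_model_path(model_dir, MODEL_NAME)`, so the serving log shows whether /opt/ml/model is empty, missing, or holds a differently named file.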
