NFS Performance Test with Amazon EFS
Who knew cpio would be used for something like this? (Honestly, until I searched I had no idea it was used for backups; I'd only ever used it to extract Android filesystem images...)
fpart + cpio + GNU Parallel is a novel combination. It should be useful well beyond Amazon EFS.
Reference: https://github.com/aws-samples/amazon-efs-tutorial
Original: https://s3.amazonaws.com/aws-us-east-1/demo/efs-parallel-file-transfer-test.html
Amazon EFS Parallel File Transfer Test
AWS Storage Days | New York | September 6-8, 2017
Version 1.0
© 2017 Amazon Web Services, Inc. and its affiliates. All rights reserved. This work may not be reproduced or redistributed, in whole or in part, without prior written permission from Amazon Web Services, Inc. Commercial copying, lending, or selling is prohibited.
Errors or corrections? Email us at darrylo@amazon.com.
Step-by-step Guide
Launch this environment to evaluate how different instance types, I/O sizes, and thread counts affect throughput to an Amazon EFS file system.
The AWS CloudFormation template will launch:
- Three Auto Scaling groups
- Recommend using default instance types for each Auto Scaling group
- t2.micro for Auto Scaling group 0
- m4.large for Auto Scaling group 1
- c4.xlarge for Auto Scaling group 2
- Minimum and desired size for each Auto Scaling group is 1 (maximum 4)
- Each Auto Scaling group instance will auto mount the identified Amazon EFS file system, generate 5GB of 1MB files and 5GB of 10MB files using smallfile (see the example invocation after this list), and install the following applications:
- nload - a console application that monitors network traffic and bandwidth usage in real time
- smallfile - https://github.com/bengland2/smallfile - used to generate test data; Developer: Ben England
- GNU Parallel - https://www.gnu.org/software/parallel/ - used to parallelize single-threaded commands; O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47
- Mutil mcp - https://github.com/pkolano/mutil - multi-threaded drop-in replacement of cp; Author Paul Kolano (NASA)
- fpart - https://github.com/martymac/fpart - sorts file trees and packs them into partitions; Author Ganaël Laplanche
- fpsync - wraps fpart + rsync - included in the tools/ directory of fpart
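For reference, data generation with smallfile looks roughly like the following. The exact flags and paths used by the CloudFormation bootstrap are not shown in this guide, so treat this as a sketch based on the smallfile README (--file-size is in KB, --files is per thread):
python smallfile_cli.py --operation create --threads 8 --files 640 --file-size 1024 --top /efs/smallfile   # 8 threads x 640 files x 1MB ≈ 5GB of 1MB files
python smallfile_cli.py --operation create --threads 8 --files 64 --file-size 10240 --top /efs/smallfile   # 8 threads x 64 files x 10MB ≈ 5GB of 10MB files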
NOTICE!! Amazon Web Services does NOT endorse specific 3rd party applications. These software packages are used for demonstration purposes only. Follow all express or implied license agreements associated with these 3rd party software products.
WARNING!! If you build the above mentioned environment, this will exceed your free-usage tier. You will incur charges as a result of creating this environment and running these scripts in your AWS account. If you run this environment for 1 hour, you may incur a charge of ~$1.18.
You can launch this CloudFormation stack, using your account, in the following AWS Regions:
AWS Region Code | Name |
---|---|
us-east-1 | US East (N. Virginia) |
us-east-2 | US East (Ohio) |
us-west-2 | US West (Oregon) |
eu-west-1 | EU (Ireland) |
eu-central-1 | EU (Frankfurt) |
ap-southeast-2 | AP (Sydney) |
SSH to all three EC2 instances
Not all EC2 instances are created equal
Run this command against t2.micro
1. Write 17GB to EFS w/ 1MB block size
time dd if=/dev/zero of=/efs/dd/17G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=1M count=17408 conv=fsync &
nload -u M
While this is running, continue and run Step 2 in a separate terminal session.
Run this command against m4.large
2. Write 5GB to EFS w/ 1MB block size
time dd if=/dev/zero of=/efs/dd/5G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=1M count=5120 conv=fsync &
nload -u M
Maximize throughput using larger I/O size
Run the remaining commands against c4.xlarge
3. Write 2GB to EBS w/ 1MB block size - ‘sync’ once at the end
time dd if=/dev/zero of=/ebs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=1M count=2048 status=progress conv=fsync
Record run time.
4. Write 2GB to EFS w/ 1MB block size - ‘sync’ once at the end
time dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=1M count=2048 status=progress conv=fsync
Record run time.
5. Write 2GB to EBS w/ 8MB block size - ‘sync’ once at the end
time dd if=/dev/zero of=/ebs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=8M count=256 status=progress conv=fsync
Record run time.
6. Write 2GB to EFS w/ 8MB block size - ‘sync’ once at the end
time dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=8M count=256 status=progress conv=fsync
Record run time.
Sample run times
Step & Command | Duration |
---|---|
3. Write 2GB to EBS w/ 1MB block size - ‘sync’ once at the end | 22 seconds |
4. Write 2GB to EFS w/ 1MB block size - ‘sync’ once at the end | 12 seconds |
5. Write 2GB to EBS w/ 8MB block size - ‘sync’ once at the end | 22 seconds |
6. Write 2GB to EFS w/ 8MB block size - ‘sync’ once at the end | 12 seconds |
7. Write 2GB to EBS w/ 1MB block size - ‘sync’ after each block is written
time dd if=/dev/zero of=/ebs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=1M count=2048 status=progress oflag=sync
Record run time.
8. Write 2GB to EFS w/ 1MB block size - ‘sync’ after each block is written
time dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=1M count=2048 status=progress oflag=sync
Record run time.
9. Write 2GB to EBS w/ 8MB block size - ‘sync’ after each block is written
time dd if=/dev/zero of=/ebs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=8M count=256 status=progress oflag=sync
Record run time.
10. Write 2GB to EFS w/ 8MB block size - ‘sync’ after each block is written
time dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N).img bs=8M count=256 status=progress oflag=sync
Record run time.
Sample run times
Step & Command | Duration |
---|---|
7. Write 2GB to EBS w/ 1MB block size - ‘sync’ after each block is written | 22 seconds |
8. Write 2GB to EFS w/ 1MB block size - ‘sync’ after each block is written | 1 minute 43 seconds |
9. Write 2GB to EBS w/ 8MB block size - ‘sync’ after each block is written | 22 seconds |
10. Write 2GB to EFS w/ 8MB block size - ‘sync’ after each block is written | 48 seconds |
Maximize throughput using parallel, multi-threaded access
Run the remaining commands against c4.xlarge
11. Write 2GB to EBS (4 threads of 512MB each) w/ 1MB block size - ‘sync’ after each block is written
time seq 0 3 | parallel --will-cite -j 4 'dd if=/dev/zero of=/ebs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N)-{}.img bs=1M count=512 oflag=sync'
Record run time.
12. Write 2GB to EFS (4 threads of 512MB each) w/ 1MB block size - ‘sync’ after each block is written
time seq 0 3 | parallel --will-cite -j 4 'dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N)-{}.img bs=1M count=512 oflag=sync'
Record run time.
13. Write 2GB to EFS (4 threads of 512MB each) w/ 1MB block size - ‘sync’ once at the end
time seq 0 3 | parallel --will-cite -j 4 'dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N)-{}.img bs=1M count=512 conv=fsync'
Record run time.
14. Write 2GB to EBS (8 threads of 256MB each) w/ 1MB block size - ‘sync’ after each block is written
time seq 0 7 | parallel --will-cite -j 8 'dd if=/dev/zero of=/ebs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N)-{}.img bs=1M count=256 oflag=sync'
Record run time.
15. Write 2GB to EFS (8 threads of 256MB each) w/ 1MB block size - ‘sync’ after each block is written
time seq 0 7 | parallel --will-cite -j 8 'dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N)-{}.img bs=1M count=256 oflag=sync'
Record run time.
16. Write 2GB to EFS (8 threads of 256MB each) w/ 1MB block size - ‘sync’ once at the end
time seq 0 7 | parallel --will-cite -j 8 'dd if=/dev/zero of=/efs/dd/2G-dd-$(date +%Y%m%d%H%M%S.%3N)-{}.img bs=1M count=256 conv=fsync'
Record run time.
Sample run times
Step & Command | Duration | Throughput |
---|---|---|
11. Write 2GB to EBS (4 threads of 512MB each) w/ 1MB block size - ‘sync’ after each block is written | 22 seconds | ~90 MB/s |
12. Write 2GB to EFS (4 threads of 512MB each) w/ 1MB block size - ‘sync’ after each block is written | 26 seconds | ~77 MB/s |
13. Write 2GB to EFS (4 threads of 512MB each) w/ 1MB block size - ‘sync’ once at the end | 12 seconds | ~167 MB/s |
14. Write 2GB to EBS (8 threads of 256MB each) w/ 1MB block size - ‘sync’ after each block is written | 22 seconds | ~90 MB/s |
15. Write 2GB to EFS (8 threads of 256MB each) w/ 1MB block size - ‘sync’ after each block is written | 14 seconds | ~143 MB/s |
16. Write 2GB to EFS (8 threads of 256MB each) w/ 1MB block size - ‘sync’ once at the end | 12 seconds | ~167 MB/s |
Maximize throughput - EFS parallel file transfer test
Run the remaining commands against c4.xlarge
Identify size of data set to be transferred.
du -csh /ebs/data-1m/
find /ebs/data-1m/. -type f | wc -l
Set variable
Set the instanceid variable; it is exported so that the shells GNU Parallel spawns in step 22 can expand it.
export instanceid=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
17. Transfer files from EBS to EFS using rsync
Drop caches.
sudo su
sync && echo 3 > /proc/sys/vm/drop_caches
exit
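This drop-caches sequence repeats before each transfer test. A one-line equivalent that avoids switching to a root shell (shown once here as a sketch; it has the same effect):
sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'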
time rsync -r /ebs/data-1m/ /efs/rsync/${instanceid} &
nload -u M
Record throughput.
18. Transfer files from EBS to EFS using cp
Drop caches.
sudo su
sync && echo 3 > /proc/sys/vm/drop_caches
exit
time cp -r /ebs/data-1m/* /efs/cp/${instanceid} &
nload -u M
Record throughput.
Set variable
Set the threads variable to 4 threads per vCPU.
threads=$(($(nproc --all) * 4))
19. Transfer files from EBS to EFS using fpsync
Drop caches.
sudo su
sync && echo 3 > /proc/sys/vm/drop_caches
exit
time /usr/local/bin/fpsync -n ${threads} -v /ebs/data-1m/ /efs/fpsync/${instanceid} &
nload -u M
Record throughput.
20. Transfer files from EBS to EFS using mcp
Drop caches.
sudo su
sync && echo 3 > /proc/sys/vm/drop_caches
exit
time mcp -r --threads=${threads} /ebs/data-1m/* /efs/mcp/${instanceid} &
nload -u M
Record throughput.
21. Transfer files from EBS to EFS using efscp script (cp + GNU Parallel)
Drop caches.
sudo su
sync && echo 3 > /proc/sys/vm/drop_caches
exit
time /home/ec2-user/efscp.sh /ebs/data-1m/ /efs/efscp ${threads} &
nload -u M
Record throughput.
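The efscp.sh script itself is not reproduced in this guide. A minimal, hypothetical sketch of the cp + GNU Parallel idea it names might look like this (the real script's flags and layout may differ):
find /ebs/data-1m/ -type f | parallel --will-cite -j ${threads} cp {} /efs/efscp/   # hypothetical stand-in: copies each file into the target directory (source directory structure not preserved)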
22. Transfer files from EBS to EFS using fpart + cpio + GNU Parallel
Drop caches.
sudo su
sync && echo 3 > /proc/sys/vm/drop_caches
exit
time /usr/local/bin/fpart -Z -n 1 -o /home/ec2-user/fpart-files-to-transfer /ebs/data-1m
time parallel --will-cite -j ${threads} --pipepart --round-robin --block 1M -a /home/ec2-user/fpart-files-to-transfer.0 'sudo cpio -pdm {} /efs/parallelcpio/${instanceid}/' &
nload -u M
Record throughput.
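The same pattern applied to a generic tree, as a hedged sketch (all paths here are placeholders): fpart packs the whole tree into a single file list, GNU Parallel splits that list into 1MB chunks, and each chunk is fed on stdin to its own cpio pass-through copy.
fpart -Z -n 1 -o /tmp/files-to-copy /src/tree
parallel --will-cite -j 8 --pipepart --round-robin --block 1M -a /tmp/files-to-copy.0 'cpio -pdm /dst/tree/'   # -p pass-through, -d create directories, -m preserve mtimes; source paths are recreated under the destination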
Sample run times
Step & Command | Duration | Throughput |
---|---|---|
17. Transfer 5000 ~1MB files from EBS to EFS using rsync | 10 minutes 3 seconds | ~8.3 MB/s |
18. Transfer 5000 ~1MB files from EBS to EFS using cp | 7 minutes 55 seconds | ~10.5 MB/s |
19. Transfer 5000 ~1MB files from EBS to EFS using fpsync | 4 minutes 38 seconds | ~18.6 MB/s |
20. Transfer 5000 ~1MB files from EBS to EFS using mcp | 1 minute 40 seconds | ~50.0 MB/s |
21. Transfer 5000 ~1MB files from EBS to EFS using cp + GNU Parallel | 1 minute 27 seconds | ~57.5 MB/s |
22. Transfer 5000 ~1MB files from EBS to EFS using fpart + cpio + GNU Parallel | 1 minute 4 seconds | ~78.0 MB/s |
Re-run steps 17-22, changing the source path from /ebs/data-1m to /ebs/data-10m to compare the throughput differences between small and large I/O size.
Cleanup
Delete all files on the EFS file system that were created using these scripts and delete the CloudFormation stack, so you don’t continue to incur additional charges for these resources.
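A sketch of the cleanup (the stack name is a placeholder; adjust the paths to match what you actually created):
sudo rm -rf /efs/dd /efs/rsync /efs/cp /efs/fpsync /efs/mcp /efs/efscp /efs/parallelcpio
aws cloudformation delete-stack --stack-name <your-stack-name>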
Conclusion
The distributed nature of Amazon EFS enables high levels of availability, durability, and scalability. This distributed architecture results in a small latency overhead for each file operation. Due to this per-operation latency, overall throughput generally increases as the average I/O size increases, because the overhead is amortized over a larger amount of data. Amazon EFS supports highly parallelized workloads (for example, using concurrent operations from multiple threads and multiple Amazon EC2 instances), which enables high levels of aggregate throughput and operations per second.
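As a rough, illustrative model of that amortization (the latency and wire-rate numbers below are assumptions, not measurements): with a fixed 10 ms per-operation overhead on top of a 100 MB/s stream, a 1 MB write spends 10 ms transferring plus 10 ms overhead (~50 MB/s effective), while an 8 MB write spends 80 ms transferring plus the same 10 ms overhead (~89 MB/s effective).
awk 'BEGIN { bw=100; lat=0.010; split("1 8", sizes, " "); for (i=1; i<=2; i++) { s=sizes[i]; printf "%d MB I/O -> ~%.0f MB/s effective\n", s, s/(s/bw + lat) } }'   # bw in MB/s and lat in seconds are assumed values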
For feedback, suggestions, or corrections, please email me at darrylo@amazon.com.