Re: ZSTD Project Weekly Status Update


Re: ZSTD Project Weekly Status Update

Allan Jude-9
In my continued effort to finish the integration of ZSTD into ZFS, here
is my second weekly status report:

There are still a number of existing feedback items and known issues
that need to be addressed. I am trying to work through those now.

https://github.com/openzfs/zfs/commit/622b7e1e50ab667a6af1685245a2f5a8d5e9bff3
- Addressed an issue where the user could manually set the hidden
compress_level property, causing incorrect operation. The property is
not marked with the PROP_READONLY flag because it requires PROP_INHERIT.

- It has been pointed out again recently that setting the compress=zstd
property on a dataset, but not actually writing any blocks, does not set
the zstd feature flag to 'active'. If this pool is then exported, and
imported using an older version of ZFS that does not know about zstd, it
will trigger an ASSERT() when the value of the compression property enum
is out of range. The plan is to fully activate the feature when the
property is set, but the details of how (and where) to do this still
need to be worked out.


- I am still working on the issue of inheritance for both the compress
and the hidden compress_level properties. If you create a child dataset
with the compress property set to zstd-13, it works as expected. But if
you `zfs inherit compress` on that dataset, the output of `zfs get
compress,compress_level` changes from:

zof/inherit/zstd-10/zstd-13  compression     zstd-13  local
zof/inherit/zstd-10/zstd-13  compress_level  13       local

to:

zof/inherit/zstd-10/zstd-13  compression     zstd-13  inherited from zof/inherit/zstd-10
zof/inherit/zstd-10/zstd-13  compress_level  13       local

This is because both the parent and the child actually have compress=16
(zstd); the zstd-10 or zstd-13 part is derived by combining in the
hidden compress_level property.

The expected behaviour in this case would be that the compression type
(and therefore the compress_level) would get reset to the value from the
parent.

There is a related problem when the child's compression setting is set
to lz4 (or any other type that doesn't use a level).

This project is sponsored by the FreeBSD Foundation.

--
Allan Jude





Re: ZSTD Project Weekly Status Update

Allan Jude-9
In my continuing effort to complete the integration of ZSTD into
OpenZFS, here is my third weekly status report:

https://github.com/allanjude/zfs/commit/87bb538bdbc4bb8848ed5791b7b0de84a026ebbe
- Completed the rewrite of the way the compression property is handled,
moving away from the initial approach of storing the compression
property (enum zio_compress) and the level (uint64_t) separately.

Previously we exposed the list of compression algorithms and levels to
userland as the corresponding value from the enum in the lower 7 bits,
and the level in the remaining upper bits. Then, as part of the property
GET and SET IOCTLs, we read the separate compression= and
compress_level= properties from the ZAP and returned the combined value,
or split the combined value into those two separate properties. This
worked but caused a lot of headaches around property inheritance.

Instead, I've moved the combine/split to where we read/write the
dataset properties ZAP, via the compression_changed_cb() function. So
the properties ZAP contains the combined value (the lower 7 bits are
the compression algorithm, as defined in enum zio_compress, and the
upper bits are the compression level). Elsewhere in ZFS we keep the two
values separate (os_compress and os_complevel, and related variables in
all of the different parts of ZFS).
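
As a rough illustration, the packing amounts to the following (the
macro and function names here are made up for the example, not the
actual OpenZFS ones):

#include <stdint.h>

/*
 * Sketch of the combined compression property encoding described
 * above: the low 7 bits carry the enum zio_compress value and the
 * remaining upper bits carry the compression level. Hypothetical
 * names, not the real OpenZFS macros.
 */
#define PROP_COMPRESS_BITS      7
#define PROP_COMPRESS_MASK      ((1ULL << PROP_COMPRESS_BITS) - 1)

static inline uint64_t
prop_combine(uint64_t algo, uint64_t level)
{
        return ((level << PROP_COMPRESS_BITS) |
            (algo & PROP_COMPRESS_MASK));
}

static inline void
prop_split(uint64_t combined, uint64_t *algo, uint64_t *level)
{
        *algo = combined & PROP_COMPRESS_MASK;
        *level = combined >> PROP_COMPRESS_BITS;
}

So, assuming zstd is enum value 16 (as noted in the previous report),
zstd-13 would be stored in the ZAP as (13 << 7) | 16 = 1680.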

So now, inheritance of the property is handled correctly, and we avoid
issues where a dataset with compression=zstd-12 would say 'inherited
from' a dataset with zstd at some other compression level (since both
actually just had compression=zstd, but different compress_level= values).


I have also further extended zdb to inspect the compression settings
when looking at an object:
https://github.com/allanjude/zfs/commit/3fef3c83b8ce90149110ed989bd9fd3e289798e0


I am still working on a solution for setting the zstd feature flag to
'active' as soon as the property is set, rather than only once a block
is born.


Additionally, I am investigating how to best handle the fact that
embedded block pointers compressed with ZSTD will make 'zfs send -e'
streams backwards incompatible, without a way for the user to opt out of
sending a stream that contains zstd compressed blocks that the receiving
side may not be able to read. The same can be said for 'zfs send -c' as
well. I am open to ideas on how best to handle this. I have thought
about only sending ZSTD compressed blocks if the user specifies the -Z
(--zstd) flag, but this can lead to confusion where using -c without -Z
would have to either error out, or send the ZSTD compressed blocks
uncompressed.


This project is sponsored by the FreeBSD Foundation.

--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
In my continuing effort to complete the integration of ZSTD into
OpenZFS, here is my fourth weekly status report:

https://github.com/allanjude/zfs/commit/b0b1270d4e7835ecff413208301375e3de2a4153
- Created a new test case to make sure that the ZSTD header we write
along with the data is correct. It verifies that the physical size of
the compressed data is less than the psize for the block pointer, and
that the level matches: the test picks a random level between 1 and 19
and then uses zdb to verify that the block was compressed with that
level.

I am still working on a solution for setting the zstd feature flag to
'active' as soon as the property is set, rather than only once a block
is born, as well as fixing up compatibility around zfs send/recv with
the embedded block pointers flag.

This project is sponsored by the FreeBSD Foundation.


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the fifth weekly status report on the project to integrate ZSTD
into OpenZFS.

https://github.com/c0d3z3r0/zfs/pull/14/commits/9807c99169e5931a754bb0df68267ffa2f289474
- Created a new test case to ensure that ZSTD compressed blocks survive
replication with the -c flag. We wanted to make sure the on-disk
compression header survived the trip.

https://github.com/c0d3z3r0/zfs/pull/14/commits/94bef464fc304e9d6f5850391e41720c3955af11
- I split the zstd.c file into OS-specific bits
(module/os/{linux,freebsd}/zstd_os.c) and also split the .h file into
zstd.h and zstd_impl.h. This was done so that FreeBSD can use the
existing kmem_cache mechanism, while Linux can use the vmem_alloc pool
created in the earlier versions of this patch. I significantly changed
the FreeBSD implementation from my earlier work, to reuse the
power-of-2 zio_data_buf_cache[]s that already exist, only adding a few
additional kmem_caches for large blocks with high compression levels.
This should avoid creating as many unnecessary kmem caches.
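
To make the reuse concrete, the size-class selection implied above
looks roughly like this (a sketch; the maximum cache size is an
assumption for the example, not the patch's actual value):

#include <stddef.h>

/*
 * Round a zstd workspace request up to the next power of two so it
 * can be served from the existing power-of-2 zio_data_buf_cache[]
 * sizes; anything past the largest zio cache would come from the few
 * extra kmem_caches the patch adds.
 */
#define ZIO_CACHE_MAX_SIZE      (16 * 1024 * 1024)      /* assumed */

static size_t
zstd_cache_size(size_t len)
{
        size_t size = 1;

        while (size < len)
                size <<= 1;

        /* Larger requests fall through to the extra caches. */
        return (size <= ZIO_CACHE_MAX_SIZE ? size : len);
}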

https://github.com/c0d3z3r0/zfs/pull/14/commits/3d48243b77e6c8c3bf562c7a2315dd6cc571f28c
- Lastly, in my testing I was seeing a lot of hits on the new
compression-failure kstat I added. This was caused by the ZFS "early
abort" feature, where we give the compressor an output buffer that is
smaller than the input, so it will fail if the block will not compress
enough to be worth it. This helps avoid wasting CPU on incompressible
blocks. However, it seems the 'one-file' version of zstd we are using
does not expose the ZSTD_ErrorCode enum. This needs to be investigated
further to avoid issues if the value changes (although it is apparently
stable after version 1.3.1).
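
For reference, the early-abort pattern looks roughly like this against
the public zstd API (a sketch, not the actual zio_compress code; the
12.5% margin mirrors what the in-tree compression path uses):

#include <stddef.h>
#include <zstd.h>

/*
 * Give the compressor an output buffer smaller than the input, so
 * ZSTD_compress() fails fast when the block will not shrink enough
 * to be worth storing compressed.
 */
static int
try_compress(const void *src, size_t s_len, void *dst, int level,
    size_t *c_len)
{
        size_t d_len = s_len - (s_len >> 3);    /* 12.5% smaller */
        size_t ret = ZSTD_compress(dst, d_len, src, s_len, level);

        if (ZSTD_isError(ret))
                return (-1);    /* caller stores block uncompressed */

        *c_len = ret;
        return (0);
}

The kstat hits come from counting every such failure as an error, when
a destination-too-small failure is expected behaviour; telling the two
apart is what needs the ZSTD_ErrorCode enum.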

I am still working on a solution for zfs send stream compatibility. I am
leaning towards creating a new flag, --zstd, to enable ZSTD compressed
output. If the -e or -c flags are used without the --zstd flag, and the
dataset has the zstd feature active, the idea would be to emit a warning
but send the blocks uncompressed, so that the stream remains compatible
with older versions of ZFS. I will be discussing this on the OpenZFS
Leadership call tomorrow, and am open to suggestions on how best to
handle this.


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the sixth weekly status report on the project to integrate ZSTD
into OpenZFS.

https://github.com/openzfs/zfs/pull/10631 - Improved the `zfs recv`
error handling when it receives an out-of-bounds property value.
Specifically, if a zfs send stream is created by a system that supports
a newer compression or checksum type, the property will fail to be set
on the receiving system. This is fine, but `zfs recv` would abort() and
create a core file, rather than reporting the error, because it did not
understand the EINVAL being returned for that property. In the case
where the property is outside the accepted range, we now return the new
ZFS_ERR_BADPROP value, and the correct message is displayed to the user.
I opted not to use ERANGE because that is used for 'this property value
should not be used on a root pool'. The idea is to get this fix merged
into the 0.8.x branch for the next point release, to improve
compatibility with streams generated by OpenZFS 2.0.
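
The shape of the new check is roughly the following (a self-contained
sketch; the stub values stand in for the real OpenZFS definitions):

#include <stdint.h>

/* Stub values standing in for the real OpenZFS definitions. */
enum { ZIO_COMPRESS_FUNCTIONS = 17 };   /* illustrative */
enum { ZFS_ERR_BADPROP = 1036 };        /* illustrative */

/*
 * A stream from a newer sender can carry a compression (or checksum)
 * enum value this kernel does not know about. Returning a dedicated
 * error instead of EINVAL lets userland print a proper message rather
 * than abort()ing.
 */
static int
validate_recv_compress(uint64_t intval)
{
        if (intval >= ZIO_COMPRESS_FUNCTIONS)
                return (ZFS_ERR_BADPROP);
        return (0);
}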


https://github.com/openzfs/zfs/pull/10632 - General improvement to error
handling when the error code is EZFS_UNKNOWN.


https://github.com/allanjude/zfs/commit/8f37c1ad8edaff20a550b3df07995dab80c06492
- ZFS replication compatibility improvements. As discussed on the
leadership call earlier this month, keep the compatibility simple. If
the -c flag is given, send blocks compressed with any compression
algorithm. The improved error handling will let the user know if their
system can't handle ZSTD.


https://github.com/allanjude/zfs/commit/0ffd80e281f79652973378599cd0332172f365bd
- per-dataset feature activation. This switches the ZSTD feature flag
from 'enabled' to 'active' as soon as the property is set, instead of
when the first block is written. This ensures that the pool can't be
imported on a system that does not support ZSTD, where it would cause
the ZFS CLI tools to panic.


I will be working on adding some tests for the feature activation.

I've been looking at ways to add tests for the replication changes, but
it doesn't seem to be easy to test the results of a 'zfs recv' that does
not know about ZSTD (where the values are outside of the valid range for
the enum). If anyone has any ideas here, I'd be very interested.


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the seventh weekly status report on the project to integrate
ZSTD into OpenZFS.

The compatibility-related changes I created last week were refined and
merged into the mainline branch.

Thanks to Brian Behlendorf for reviewing my proposed change for the zstd
feature flag activation, and pointing out a better approach. I have
reworked the patch based on his suggestion and prototype:

https://github.com/allanjude/zfs/commit/2508dafcec0a05d61afc5fbd5da356e201afbe97
- Activate the per-dataset ZSTD feature flag as soon as the property is
set to ZSTD. Before, simply doing `zfs set compression=zstd dataset`
would not activate the feature flag; it would only be activated when the
first block that used ZSTD compression was written (see
dsl_dataset_block_born()). This meant that if you set the property and
then exported the pool, the pool would still import on systems with
older versions of ZFS that did not support ZSTD, but would crash their
userspace tools, because the property value was out of bounds.


https://github.com/allanjude/zfs/commit/b8bec3fd2a8feb3a4de572eb15515d3764f92a35
- I created a test that ensures that the feature flag is activated by
`zfs set compression=zstd` and also ensures that the feature flag
reverts to the 'enabled' state once the last dataset using zstd is
destroyed.


The next step is ensuring that ZSTD compression interoperates properly
with the L2ARC, encryption, etc.

I've also been discussing ideas with Brian about future-proofing, to
handle the case where a newer version of ZSTD might compress the same
input differently (better ratio), and how that would impact L2ARC,
nop-write, etc. One idea (originally from Pawel Dawidek) is to do
something similar to what encryption does and split the checksum field,
using half to checksum the original data and half for the compressed
version. This would allow ZFS to detect when the same content compressed
differently (combined with the ZSTD version header in the compressed
data), giving better compatibility as we upgrade ZSTD.
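
A minimal sketch of what that split might look like (purely
illustrative; the field names and the 128/128 split are assumptions,
not a settled design):

#include <stdint.h>
#include <string.h>

/*
 * Divide the 256-bit block-pointer checksum field in half: one half
 * truncated from a checksum of the logical (uncompressed) data, the
 * other from the physical (compressed) bytes. A logical match with a
 * physical mismatch would mean "same content, different compressed
 * representation".
 */
typedef struct split_cksum {
        uint64_t sc_logical[2];     /* checksum of uncompressed data */
        uint64_t sc_physical[2];    /* checksum of compressed bytes */
} split_cksum_t;

static void
split_cksum_fill(split_cksum_t *sc, const uint64_t logical256[4],
    const uint64_t physical256[4])
{
        /* Keep the low 128 bits of each full 256-bit checksum. */
        memcpy(sc->sc_logical, logical256, sizeof (sc->sc_logical));
        memcpy(sc->sc_physical, physical256, sizeof (sc->sc_physical));
}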


This project is sponsored by the FreeBSD Foundation.



--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the eighth weekly status report on the project to complete the
integration of ZSTD compression into OpenZFS.

https://github.com/openzfs/zfs/pull/10692 - I created some new tests
around the L2ARC to facilitate testing of ZSTD + L2ARC. These tests
found an issue (even with just lz4 compression) where if the
compressed_arc feature is disabled, the wrong size is used when
calculating the checksum of the buffer read back from the L2ARC,
resulting in a silent checksum failure. After the block from the L2ARC
fails to checksum, it is re-read from the main pool.

https://github.com/openzfs/zfs/pull/10693 - I have created a patch to
fix the issue between L2ARC and compressed_arc.

https://github.com/allanjude/zfs/commit/1f565ef0c6bd2e785fb3777c111184bb4bc551c4
- A followup to the rewritten version of the ZSTD feature activation
code. The handling of zfs_prop_set_special() was not actually setting
the property, so we return -1 so that the normal property setting
routine will be followed, in addition to the special handling.

https://github.com/allanjude/zfs/commit/8eac845a221952b3c9c52b4caf9be4bdf401e2b9
- Fixed an issue where if compression failed (this can be triggered by
"early abort", where the data is incompressible and won't fit in the
output buffer that is 12.5% smaller than the input), we would skip the
encryption code block, which could result in data being written to the
L2ARC uncompressed and unencrypted.

Based on the above, I am considering that we might want to calculate the
checksum of the block after we re-transform it and make sure it matches
the checksum in the block pointer; if it does not, we would just skip
writing to the L2ARC, as if the block were ineligible for one of the
normal reasons. This would ensure we don't end up reading from the L2ARC
only to re-read from the main pool because the block did not survive the
trip.
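
In rough terms, the safeguard would look like this (a sketch using the
fletcher-4 algorithm ZFS defaults to for data, with a plain C harness
rather than the real zio_checksum machinery):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain fletcher-4 over 32-bit words, the ZFS default data checksum. */
static void
fletcher_4(const void *buf, size_t len, uint64_t cksum[4])
{
        const uint32_t *ip = buf;
        const uint32_t *end = ip + (len / sizeof (uint32_t));
        uint64_t a = 0, b = 0, c = 0, d = 0;

        for (; ip < end; ip++) {
                a += *ip;
                b += a;
                c += b;
                d += c;
        }
        cksum[0] = a; cksum[1] = b; cksum[2] = c; cksum[3] = d;
}

/*
 * Only write a buffer to the L2ARC when the re-transformed bytes
 * still match the checksum stored in the block pointer.
 */
static bool
l2arc_cksum_ok(const void *transformed, size_t len,
    const uint64_t bp_cksum[4])
{
        uint64_t actual[4];

        fletcher_4(transformed, len, actual);
        return (memcmp(actual, bp_cksum, sizeof (actual)) == 0);
}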

That leaves just the future-proofing bits (L2ARC, nop-write, etc., for
when a newer ZSTD does not recompress the block in the same way), but
that specific work doesn't need to block merging ZSTD support.

This project is sponsored by the FreeBSD Foundation.


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the ninth weekly status report on my FreeBSD Foundation
sponsored project to complete the integration of ZSTD compression into
OpenZFS.

https://github.com/openzfs/zfs/pull/10693 - The L2ARC fixes (for when
compressed ARC is disabled) have been merged.

https://github.com/openzfs/zfs/pull/10278/ - A number of other cleanups
and fixes for ZSTD have been integrated and squashed, and it looks like
the completed ZSTD feature will be merged very soon.

This included a bunch of fixes for makefiles and runfiles to hook the
tests I added into the ZFS test suite so they are run properly.

It looks like this will mean that the ZSTD feature will be included in
OpenZFS 2.0. Thanks to everyone who has tested, reviewed, or
contributed to this effort, especially those who kept it alive while I
was working on other things.

Post-merge, the remaining work is to develop future-proofing around ZSTD
so that we will be able to more seamlessly upgrade to newer releases of
ZSTD. Recompression of the same input resulting in the same on-disk
checksum is the main concern, as without this, upgrading the compression
algorithm will break features like nop-write.

This project is sponsored by the FreeBSD Foundation.

--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the tenth weekly status report on my FreeBSD Foundation
sponsored project to complete the integration of ZSTD compression into
OpenZFS.

Late last week the main pull request was merged, and ZSTD support is now
part of OpenZFS's trunk branch.

Last night, OpenZFS with ZSTD was imported into FreeBSD's -current branch.

I am continuing to work on a number of things related to ZSTD, including
future-proofing support (so upgrading ZSTD won't cause problems with
features like nop-write) and improving the integration of ZSTD into
FreeBSD: enabling support for booting from ZSTD compressed datasets and
improving the performance of ZSTD on FreeBSD.

I'll also be adding some additional tests to make sure we detect any
issues when we do look at updating ZSTD. Additionally, I am working on a
bunch of documentation around using ZSTD in ZFS.

For my benchmarking of ZSTD, I have been using a zfs recv of a stream in
a file on a tmpfs, and recording how long it takes to receive and sync
the data. The test data is a copy of the FreeBSD 12.1 source code, since
that is easily reproducible.

Does anyone have experience or a better suggestion on how to get the
most consistent and repeatable results when benchmarking like this?


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is the eleventh weekly status report on my FreeBSD Foundation
sponsored project to complete the integration of ZSTD compression into
OpenZFS.

As I continue to work on the future-proofing issue, I have also been
lending a hand to the integration of OpenZFS into FreeBSD, and doing a
bunch of reviewing and testing there.

I have also been doing some benchmarking of the ZSTD feature.

So far I have tried 4 different approaches, with varying results.

The test rig:
A single socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (10 cores, 20
threads)
32 GB RAM
ZFS ARC max = 4GB
4x Samsung 860 EVO SSDs


1) Using fio. This gives slightly more useful output: both bandwidth and
IOPS, plus more detail about changes over time and latency.

The downside here is that it doesn't really benchmark compression. By
default fio uses fully random data that does not compress at all. This
is still a somewhat useful metric, and the differing results seen when
varying blocksize are interesting.

fio has an option, --buffer_compress_percentage=, to select how
compressible you want the generated data to be. However, this just
switches between random data and a repeating pattern (by default null
bytes), so different levels of zstd compression all give the same
compression ratio (the level you ask fio to generate). This doesn't
really capture the real-world use case of having a tradeoff where
spending more time on compression results in greater space savings.

2) I also used 'zfs recv' to create more repeatable writes. I generated
a large dataset, 8 copies of the FreeBSD 12.1 source code, that rounds
out to around 48 GB of uncompressed data, snapshotted it, and created a
zfs send stream, stored on a tmpfs. Then I measured the time taken to
zfs recv that stream at different compression levels. I later also
redid these experiments at different record sizes.

The reason I chose to use 8 copies of the data was to make the runs long
enough at the lower compression levels to get more consistent readings.

The issue with this was also a limitation of my test setup, 4x striped
SSDs, that tends to top out around 1.3 GB/sec of writes. So the
difference between compression=off, lz4, and zstd-1 was minimal.

3) I then redid the zfs recv based testing, but with only 1 copy of the
source code (1.3 GB) and with the pool backed by a file on a tmpfs,
removing the SSDs from the equation. The raw write speed to the tmpfs
was around 3.2 GB/sec.

4) I also redid the fio based testing with a pool backed by a file on tmpfs.


I am not really satisfied with the quality of the results so far.

Does Linux have something equivalent to FreeBSD's mdconfig, where I can
create an arbitrary number of arbitrarily sized memory-backed devices
that I could use to back the pool? A file-based vdev on a tmpfs just
doesn't seem to provide the same type of results as I was expecting.

Any other suggestions would be welcome.



In the end the results will all be relative, which is mostly what we are
looking to capture: how much faster/slower is zstd at different levels
compared to lz4 and gzip, and how much more compression do you get in
exchange for that trade-off.

Hopefully next week there will be some pretty graphs.

Thanks again to the FreeBSD Foundation for sponsoring this work.


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is another weekly status report on my FreeBSD Foundation sponsored
project to complete the integration of ZSTD compression into OpenZFS.

The first batch of benchmarks is complete, although it took longer than
expected to get good data.

I am still not entirely pleased with the data, as in some cases I am
running up against limitations of my device-under-test rather than the
performance limits of ZFS.

Here is what I have so far:

https://docs.google.com/spreadsheets/d/1TvCAIDzFsjuLuea7124q-1UtMd0C9amTgnXm2yPtiUQ/edit?usp=sharing

A number of these tests were initially done on both FreeBSD and Linux on
the same machine, and the results were consistent within a 2% margin of
error, so I've taken to doing most of the tests only on FreeBSD, because
it is easier. I've struggled to get a good ramdisk solution on Ubuntu etc.

To walk you through the different tabs in the spreadsheet so far:

#1: fio SSD
This is a random write test to my pool made of 4 SSDs. This ran into the
performance limitations of the SSDs when testing the very fast
algorithms. Since the data generated by fio is completely
incompressible, there is no gain from the higher compression levels.

#2: fio to ramdisk
To overcome the limitations of the first test, I did it again with a
ramdisk. Obviously this had to be a smaller dataset, since there is
limited memory available, but it does a much better job of showing how
the zstd-fast levels scale, and how they outperform LZ4, although you
cannot compare the compression ratios, because the data is
incompressible.

#3: zfs recv to SSD
For this test, I created a dataset by extracting the FreeBSD src.txz
file 8 times (each to a different directory), then created a snapshot of
that and sent it to a file on a tmpfs.

I then timed zfs recv < /tmpfs/snapshot.zfs with each compression
algorithm. This allows you to compare the compression gain for the time
trade-off, but it again ran into the throughput limitations of the SSDs,
so it provides a bit less information about the performance of the
higher zstd-fast levels, though the compression trade-off is still
visible.

I need to reconfigure my setup to re-do this benchmark using a ramdisk.

#4: large image file 128k
For this, I created an approximately 20 GB tar file by unxz'ing the
FreeBSD 12.1 src.txz and concatenating it 16 times. This provides the
best possible case for compression.

One of the major advantages of ZSTD is that the decompression throughput
stays relatively the same even as the compression level is increased. So
while writing a zstd-19 compressed block takes a lot longer than a
zstd-3 compressed block, both decompress at nearly the same speed.

This time I measured fio random read performance. Putting the
limitations of the SSDs to good use, this test shows the read
performance gains from reading compressed. Even though the disks top out
around 1.5 GB/sec, zstd-compressed data can be read at an effective rate
of over 5 GB/sec.

#5: large image file 1m
This is the same test, but done with zfs recordsize=1m

The larger record size unlocks higher compression ratios, and achieves
throughputs in excess of 6 GB/sec.

#6: large image file 16k
This is again the same test, but with zfs recordsize=16k. This is an
approximation of reading from a large database with a 16k page size.
The lower record size provides much less compression, and the smaller
blocks result in more overhead, but there are still large performance
gains to be had from the compression, although they are much less
drastic.

I would be interested in what other tests people might be interested in
seeing before I finish wearing these SSDs out.


Thanks again to the FreeBSD Foundation for sponsoring this work.


--
Allan Jude



Re: ZSTD Project Weekly Status Update

Allan Jude-9
This is another weekly status report on my FreeBSD Foundation sponsored
project to complete the integration of ZSTD compression into OpenZFS.

I have continued to work on the benchmarks, including attempting to
compare compression and performance across various settings with pgbench
(PostgreSQL). However, the results are not yet as meaningful as I would
like.

I am still trying some other configurations to try to get clearer
results. Using multiple md(4) devices has helped keep the backing
storage from being the bottleneck.

https://docs.google.com/spreadsheets/d/1TvCAIDzFsjuLuea7124q-1UtMd0C9amTgnXm2yPtiUQ/edit?usp=sharing

Additionally, I completed rewriting my earlier commit to convert the
FreeBSD version of zstd to use FreeBSD's UMA kernel memory allocator,
rather than the custom mempool used on Linux (whose kmem_cache
abstraction is not able to handle large allocations):

https://github.com/openzfs/zfs/pull/10975
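
The UMA-backed allocation path looks something like this (a
kernel-context sketch; the zone name and workspace size are
hypothetical, not the values in the pull request):

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <vm/uma.h>

/*
 * Hypothetical sketch of backing zstd workspace allocations with a
 * FreeBSD UMA zone; UMA copes with the large allocations that the
 * kmem_cache abstraction on Linux cannot.
 */
static uma_zone_t zstd_ws_zone;

static void
zstd_uma_init(void)
{
        /* In practice there would be one zone per size class. */
        zstd_ws_zone = uma_zcreate("zstd_workspace", 512 * 1024,
            NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
}

static void *
zstd_ws_alloc(void)
{
        return (uma_zalloc(zstd_ws_zone, M_WAITOK));
}

static void
zstd_ws_free(void *ws)
{
        uma_zfree(zstd_ws_zone, ws);
}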

I am working on some benchmarks of this change as well.

Thank you.

--
Allan Jude

