RAID10 stripe size and PostgreSQL performance

RAID10 stripe size and PostgreSQL performance

Artem Naluzhnyy
Hi,

I'm benchmarking PostgreSQL with different RAID10 stripe sizes for a
new server. I tried bonnie++ and pgbench on two stripe size
configurations:

  * 32 KB (half of the current UFS bsize) - 254 pgbench tps
  * 1 MB (max supported by the RAID controller) - 626 pgbench tps

See OS/hardware configuration, benchmark methodology and raw results
here - http://pastebin.com/F8uZEZdm
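
For reference, each configuration was benchmarked with the usual
pgbench sequence, roughly like this (the exact scale factor, client
count and run length are in the pastebin; the values below are only
placeholders):

  $ createdb bench
  $ pgbench -i -s 2000 bench      # initialize the test database
  $ pgbench -c 8 -T 600 bench     # 8 clients, 10-minute run, reports tps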

Is this expected behavior, with more than twice the pgbench tps on the
1 MB stripe size?

Are there any RAID stripe size recommendations for better PostgreSQL
performance? (I cannot change the FS type, the standard PG block size,
etc. - they are locked by the vendor in this commercial FreeBSD
distribution.)

--
Artem Naluzhnyy

Re: RAID10 stripe size and PostgreSQL performance

Artem Naluzhnyy
On Mon, Jul 8, 2013 at 6:16 PM, Ivan Voras <[hidden email]> wrote:
> On 08/07/2013 14:40, Artem Naluzhnyy wrote:
>> Is this expected behavior, with more than twice the pgbench tps on
>> the 1 MB stripe size?
>
> No, it is not.
>
> For a start, can you please repeat your benchmarks but with restarting the
> PostgreSQL server between each pgbench run?

Fresh OS installation without DB warming; rebooted after the pgbench DB
initialization (DB size: 26 GB) and before benchmarking:

  * 32 KB (half of the UFS bsize) - tps=198

  * 64 KB - tps=226

  * 128 KB (default for the RAID controller) - tps=298

  * 1 MB (max for the RAID controller) - tps=347


> Also, you should make sure that the database is located on the same
> location on the disk platters by e.g. creating a small partition which
> is about 150% larger than your pgbench database (and your pgbench
> database should be at least 2x larger than your RAM, if you are going to
> benchmark IO and not memory caches), which is located at the same
> position (byte offset) in your RAID10 volume.

Unfortunately, it's not that easy to set up custom partitioning here.
However, all tests were done right after reinstalling the server, using
exactly the same sequence of commands.


The server has 24 GB of RAM, so with an 88 GB DB we get:

  * 32 KB stripe - tps=161

  * 1 MB stripe - tps=258


The server is used for VoIP billing, and it also dumps lots of
plain-text log files. Would it still be better to use the 1 MB stripe
size, or might that have side effects on performance?

--
Artem Naluzhnyy

Re: RAID10 stripe size and PostgreSQL performance

Ivan Voras
On 11/07/2013 23:07, Artem Naluzhnyy wrote:

> On Mon, Jul 8, 2013 at 6:16 PM, Ivan Voras <[hidden email]> wrote:
>> On 08/07/2013 14:40, Artem Naluzhnyy wrote:
>>> Is this expected behavior, with more than twice the pgbench tps on
>>> the 1 MB stripe size?
>>
>> No, it is not.
>>
>> For a start, can you please repeat your benchmarks but with restarting the
>> PostgreSQL server between each pgbench run?
>
> Fresh OS installation without DB warming; rebooted after the pgbench DB
> initialization (DB size: 26 GB) and before benchmarking:
>
>   * 32 KB (half of the UFS bsize) - tps=198
>
>   * 64 KB - tps=226
>
>   * 128 KB (default for the RAID controller) - tps=298
>
>   * 1 MB (max for the RAID controller) - tps=347

I just looked at your RAID configuration at http://pastebin.com/F8uZEZdm
and you have a mirror of stripes (RAID-01), not a stripe of mirrors
(RAID-10). And apparently, if I parse your configuration correctly, you
have a 1 MB stripe in the MIRROR part of the RAID, and an unknown stripe
size in the STRIPE part.

Mirroring may help your read performance, but it will not help your
write performance. If you are running pgbench with default settings,
and with a test database that can fit in RAM, you probably end up
caching all reads eventually, and then writes become the bottleneck.

>> Also, you should make sure that the database is located on the same
>> location on the disk platters by e.g. creating a small partition which
>> is about 150% larger than your pgbench database (and your pgbench
>> database should be at least 2x larger than your RAM, if you are going to
>> benchmark IO and not memory caches), which is located at the same
>> position (byte offset) in your RAID10 volume.
>
> Unfortunately, it's not that easy to set up custom partitioning here.
> However, all tests were done right after reinstalling the server, using
> exactly the same sequence of commands.

I'm not saying that your production database should be on a custom
partition, but your pgbench test database (and the file for the
following test) should.
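
Something along these lines would be enough for the test partition,
assuming a bit of space in the slice can be left unallocated for it
(the size, alignment and resulting partition name are only
illustrative):

  # gpart add -t freebsd-ufs -a 1m -s 40g mfid0s1
  # newfs -U /dev/mfid0s1f
  # mkdir -p /pgtest && mount /dev/mfid0s1f /pgtest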

Anyway, could you please do one more test:

1) create a large file with "dd if=/dev/zero of=file bs=1m count=48000"
2) install /usr/ports/benchmarks/randomio
3) run "randomio file 8 0.5 1 8192 10 10"

... and report the results.
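
(If I remember the randomio arguments correctly, that is 8 concurrent
threads doing random 8 KB I/O - the PostgreSQL page size - with a 50/50
read/write mix and an fsync after every write, sampled every 10
seconds, so it should roughly resemble the I/O pattern pgbench
generates.)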




Re: RAID10 stripe size and PostgreSQL performance

Artem Naluzhnyy
On Fri, Jul 12, 2013 at 4:55 PM, Ivan Voras <[hidden email]> wrote:
> I just looked at your RAID configuration at http://pastebin.com/F8uZEZdm
> and you have a mirror of stripes (RAID-01), not a stripe of mirrors
> (RAID-10). And apparently, if I parse your configuration correctly, you
> have a 1 MB stripe in the MIRROR part of the RAID, and an unknown stripe
> size in the STRIPE part.

This is probably a bug in mfiutil output. There is no "RAID 01" option
in the controller configuration, and its documentation says
(http://goo.gl/6X5pe):

"RAID 10, a combination of RAID 0 and RAID 1, consists of striped data
across mirrored spans. A RAID 10 drive group is a spanned drive group
that creates a striped set from a series of mirrored drives. RAID 10
allows a maximum of eight spans. You must use an even number of drives
in each RAID virtual drive in the span. The RAID 1 virtual drives must
have the same stripe size."

There is also no option to configure a different stripe size for the
mirrors; I can only set it globally for the whole RAID 10 volume.


> Anyway, could you please do one more test:
>
> 1) create a large file with "dd if=/dev/zero of=file bs=1m count=48000"
> 2) install /usr/ports/benchmarks/randomio
> 3) run "randomio file 8 0.5 1 8192 10 10"
>
> ... and report the results.

See results at the end of http://pastebin.com/F8uZEZdm


There is yet another issue that, I guess, makes all the previous
benchmarks somewhat inaccurate and irrelevant - it looks like the UFS
partitions are not aligned properly:

$ gpart show
=>        63  1167966145  mfid0  MBR  (557G)
          63  1167957567      1  freebsd  [active]  (556G)
  1167957630        8578         - free -  (4.2M)

=>         0  1167957567  mfid0s1  BSD  (556G)
           0     4194304        1  freebsd-ufs  (2.0G)
     4194304    16777216        2  freebsd-swap  (8.0G)
    20971520  1130217472        5  freebsd-ufs  (539G)
  1151188992    16768575        4  freebsd-ufs  (8G)
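
(If I read the gpart output right, the slice starts at sector 63, i.e.
at an offset of 63 x 512 B = 32,256 B, which is not a multiple of any
of the tested stripe sizes. With the smaller stripes a noticeable
fraction of 8 KB page I/Os therefore straddle a stripe boundary and
touch two mirror pairs, which may also partly explain the results
above.)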


I will also try to fix the alignment and rerun the tests.

--
Artem Naluzhnyy

Re: RAID10 stripe size and PostgreSQL performance

John Baldwin
On Friday, July 12, 2013 3:15:03 pm Artem Naluzhnyy wrote:

> On Fri, Jul 12, 2013 at 4:55 PM, Ivan Voras <[hidden email]> wrote:
> > I just looked at your RAID configuration at http://pastebin.com/F8uZEZdm
> > and you have a mirror of stripes (RAID-01), not a stripe of mirrors
> > (RAID-10). And apparently, if I parse your configuration correctly, you
> > have a 1 MB stripe in the MIRROR part of the RAID, and an unknown stripe
> > size in the STRIPE part.
>
> This is probably a bug in mfiutil output. There is no "RAID 01" option
> in the controller configuration, and its documentation says
> (http://goo.gl/6X5pe):
>
> "RAID 10, a combination of RAID 0 and RAID 1, consists of striped data
> across mirrored spans. A RAID 10 drive group is a spanned drive group
> that creates a striped set from a series of mirrored drives. RAID 10
> allows a maximum of eight spans. You must use an even number of drives
> in each RAID virtual drive in the span. The RAID 1 virtual drives must
> have the same stripe size."
>
> There is also no option to configure a different stripe size for the
> mirrors; I can only set it globally for the whole RAID 10 volume.

It is true that mfi only does stripes across RAID mirrors.  mfiutil depends on
the secondary RAID level being set in the DDF info to detect a RAID-10 vs
a RAID-1, but not all mfi BIOS-configured volumes have that set.  It should
probably check whether a volume spans multiple arrays instead.

--
John Baldwin