Running commands from the shell with a timeout (pt 2)

Fernando Ipar

Running commands from the shell with a timeout (pt 2)

Fernando Ipar

Rewriting Highbase in Erlang

Fernando Ipar

Rewriting Highbase in Erlang

Fernando Ipar

MySQL Certification self study

Fernando Ipar

MySQL Certification self study

Fernando Ipar

Using the ENUM data type to increase performance

Fernando Ipar

Using MySQL Proxy to benchmark query performance

Fernando Ipar

Using MySQL Proxy to benchmark query performance

Fernando Ipar

Making use of procedure analyse()

Fernando Ipar

Making use of procedure analyse()

Fernando Ipar

Extending procedure_analyse

Fernando Ipar

Extending procedure_analyse

Fernando Ipar

iptables trick to limit concurrent tcp connections

Fernando Ipar

Indexing text columns in MySQL

Fernando Ipar

Generating data with dbmonster

Fernando Ipar

Generating data with dbmonster

Fernando Ipar

how to prevent explain from executing subqueries

Fernando Ipar

how to prevent explain from executing subqueries

Fernando Ipar

My agenda for MySQL NoSQL Cloud 2013

Fernando Ipar

My agenda for MySQL NoSQL Cloud 2013

Fernando Ipar

RMySQL basics

Fernando Ipar

Ongoing MySQL myths

Fernando Ipar

Ongoing MySQL myths

Fernando Ipar

Benchmarking Joomla

Fernando Ipar

Recovering MySQL access

Fernando Ipar

Using the ENUM data type to increase performance

Fernando Ipar

iptables trick to limit concurrent tcp connections

Fernando Ipar

Indexing text columns in MySQL

Fernando Ipar

Migrating to jekyll

Fernando Ipar

Ongoing MySQL myths

Fernando Ipar

Best ETA ever

Fernando Ipar

how to prevent explain from executing subqueries

Fernando Ipar

Migrating to jekyll

Fernando Ipar

Generating random salts from bash

Fernando Ipar

Vagrant environment for MySQL Workshop

Fernando Ipar

SOUNDEX(), triggers, and stored procedures

Fernando Ipar

thread safe attribute accessors for ruby classes, using clojure Atoms

Fernando Ipar

Using MySQL sandbox for testing

Fernando Ipar

Updated mysql-proxy benchmarking script (for proxy 0.7)

Fernando Ipar

RMySQL basics

Fernando Ipar

thread safe accessors for ruby classes using clojure atoms, now packaged as gem

Fernando Ipar

Console histograms

Fernando Ipar

I love playing Monopoly

Fernando Ipar

Some thoughts on org-mode/babel

Fernando Ipar

Some thoughts on org-mode/babel

Fernando Ipar

High Availability MySQL Cookbook review

Fernando Ipar

Scriptlance for musicians?

Fernando Ipar

Piping data to multiple processes

Fernando Ipar

New release of MySQL Proxy GPL

Fernando Ipar

Migrating to jekyll

Fernando Ipar

Migrating to jekyll

Fernando Ipar

Running commands from the shell with a timeout

Fernando Ipar

A new name and a wider scope for South America's favourite MySQL-related conference

Fernando Ipar

A new name and a wider scope for South America's favourite MySQL-related conference

Fernando Ipar

A new name and a wider scope for South America's favourite MySQL-related conference

Fernando Ipar

processing data to obtain meaningful graphic representations

Fernando Ipar

A new name and a wider scope for South America's favourite MySQL-related conference

Fernando Ipar

Extending procedure_analyse

Fernando Ipar

Recovering MySQL access恢复MySQL访问

Fernando Ipar

Intrusion detection at the application level, for PHP

Fernando Ipar

Presenting at Percona Live NY 2012

Fernando Ipar

MySQL can be evil

Fernando Ipar

MySQL can be evil

Fernando Ipar

Online notes

Fernando Ipar

Using the ENUM data type to increase performance

Fernando Ipar

Using MySQL Proxy to benchmark query performance

Fernando Ipar

Coding Buddies

Fernando Ipar

Using the ENUM data type to increase performance

Fernando Ipar

Opening urls from emacs using Google Chrome on MacOS

Fernando Ipar

Opening urls from emacs using Google Chrome on MacOS

Fernando Ipar

Opening urls from emacs using Google Chrome on MacOS

Fernando Ipar

Expanding your keyboard on emacs with the Hyper key

Fernando Ipar

Expanding your keyboard on emacs with the Hyper key

Fernando Ipar

Expanding your keyboard on emacs with the Hyper key

Fernando Ipar

Measuring the potential overhead of pmm-client on MySQL workloads

Fernando Ipar

Measuring the potential overhead of pmm-client on MySQL workloads

Fernando Ipar

Top 25 most dangerous programming errors

Fernando Ipar

Coffee May or May not cause hallucinations

Fernando Ipar

Running sysbench-based benchmarks against Cassandra

Fernando Ipar

Running sysbench-based benchmarks against Cassandra

Fernando Ipar

What is a good cache hit ratio?

Fernando Ipar

What is a good cache hit ratio?

Fernando Ipar

Running sysbench-based benchmarks against Cassandra

Fernando Ipar

Piping data to multiple processes

Fernando Ipar

My org-mode workflow

Fernando Ipar

My org-mode workflow

Fernando Ipar

My org-mode workflow

Fernando Ipar

Scriptlance for musicians?

Fernando Ipar

Running commands from the shell with a timeout

Fernando Ipar

Running commands from the shell with a timeout (pt 2)

Fernando Ipar

Top 25 most dangerous programming errors

Fernando Ipar

Coffee May or May not cause hallucinations

Fernando Ipar

Symbolics Lisp Machine

Fernando Ipar

Rewriting Highbase in Erlang

Fernando Ipar

Coding Buddies

Fernando Ipar

MySQL Certification self study

Fernando Ipar

Using the ENUM data type to increase performance

Fernando Ipar

High Availability MySQL Cookbook review

Fernando Ipar

Piping data to multiple processes

Fernando Ipar

Presenting at Percona Live NY 2012

Fernando Ipar

Best ETA ever

Fernando Ipar

Console histograms

Fernando Ipar

Online notes

Fernando Ipar

Vagrant environment for MySQL Workshop

Fernando Ipar

RMySQL basics

Fernando Ipar

processing data to obtain meaningful graphic representations

Fernando Ipar

Recovering MySQL access

Fernando Ipar

thread safe attribute accessors for ruby classes, using clojure Atoms

Fernando Ipar

Some thoughts on org-mode/babel

Fernando Ipar

A new name and a wider scope for South America's favourite MySQL-related conference

Fernando Ipar

MySQL can be evil

Fernando Ipar

Opening urls from emacs using Google Chrome on MacOS

Fernando Ipar

Top 8 things every database practitioner should know

Fernando Ipar

Top 8 things every database practitioner should know

Fernando Ipar

Top 8 things every database practitioner should know

Fernando Ipar

Minimal networking knowledge recommended for DBAs

Fernando Ipar

Running commands from the shell with a timeout

Fernando Ipar

My org-mode workflow Мой рабочий режим в режиме орг

Fernando Ipar

Questions to ask when evaluating a database

Fernando Ipar

Questions to ask when evaluating a database

Fernando Ipar

Documenting emacs hyper shortcuts

Fernando Ipar

Documenting emacs hyper shortcuts

Fernando Ipar

This is a test post generated from org-mode

Fernando Ipar

Rewriting Highbase in Erlang

Fernando Ipar

Seems a bit complicated. Have a look at:
http://www.pixelbeat.org/sc...

Also note that there will be a timeout command included in newer versions of coreutils.

Pádraig Brady

Seems a bit complicated. Have a look at:
http://www.pixelbeat.org/sc...

Also note that there will be a timeout command included in newer versions of coreutils.

Pádraig Brady

Thanks for the reply. I had two similar replies in reddit, I guess I was really out of touch.

I've just installed timeout through apt-get and it does just what my script from Highbase does, though I'll wait until it's included in coreutils to remove my self made version.

I tried to figure out when timeout is coming with coreutils, and it seems you're the contributor :) Any ideas for a date?

fernando

Thanks for the reply. I had two similar replies in reddit, I guess I was really out of touch.

I've just installed timeout through apt-get and it does just what my script from Highbase does, though I'll wait until it's included in coreutils to remove my self made version.

I tried to figure out when timeout is coming with coreutils, and it seems you're the contributor :) Any ideas for a date?

fernando

Hi Fernando, before start using gen_server you should take a deeper look into gen_fsm. It's a behaviour for finite state machines as you've described above. And it's as easy and powerful as gen_server. Regards, mue

Frank Mueller

Hi Fernando, before start using gen_server you should take a deeper look into gen_fsm. It's a behaviour for finite state machines as you've described above. And it's as easy and powerful as gen_server. Regards, mue

Frank Mueller

Hi Frank,

Thanks for your recommendation.
I did read a couple of examples of gen_fsm and yes, it is better for this design, I guess I put it in the back of my mind, so it's good to have someone more experienced in Erlang/OTP remind me about it :)

I'm still trying to put all the pieces together, I'm definitely trying to make a good use of the event handling mechanism.

Going through Armstrong's book for, like, the fourth time now :)

Regards, Fernando.

fernando

Hi Frank,

Thanks for your recommendation.
I did read a couple of examples of gen_fsm and yes, it is better for this design, I guess I put it in the back of my mind, so it's good to have someone more experienced in Erlang/OTP remind me about it :)

I'm still trying to put all the pieces together, I'm definitely trying to make a good use of the event handling mechanism.

Going through Armstrong's book for, like, the fourth time now :)

Regards, Fernando.

fernando

Should have seen this a fortnight back. And would not have gone to put down code for Open PHP MyProfiler http://www.php-trivandrum.o....

Open PHP MyProfiler is just a trial to run query profiling on a php-mysql application, without changing the architecture too much. The profiler is open and downloadable, though the analzer, is just a mockup or a bare one and not yet ripe to be opened up. Any one who needs to do the analysis could make use of the same by downloading the profiler, and implementing with their code. The profiler would create logs depending on hostname and date. upload the profile logs to our Profile Sampler and you should be able to see the full profile of your application.

php trivandrum

Should have seen this a fortnight back. And would not have gone to put down code for Open PHP MyProfiler http://www.php-trivandrum.o....

Open PHP MyProfiler is just a trial to run query profiling on a php-mysql application, without changing the architecture too much. The profiler is open and downloadable, though the analzer, is just a mockup or a bare one and not yet ripe to be opened up. Any one who needs to do the analysis could make use of the same by downloading the profiler, and implementing with their code. The profiler would create logs depending on hostname and date. upload the profile logs to our Profile Sampler and you should be able to see the full profile of your application.

php trivandrum

I did try this out.. first thing, in your lua script, should it not be "in", instead of "inj" as parameter for read_query_result()?

Also to achieve this, you dont need the read_query and the global query, since in.query will give the original query as read here: http://dev.mysql.com/doc/my...

php trivandrum

I did try this out.. first thing, in your lua script, should it not be "in", instead of "inj" as parameter for read_query_result()?

Also to achieve this, you dont need the read_query and the global query, since in.query will give the original query as read here: http://dev.mysql.com/doc/my...

php trivandrum

Sorry, I was wrong in the second part. To trigger read_query_result, the first injection is needed

php trivandrum

Sorry, I was wrong in the second part. To trigger read_query_result, the first injection is needed

php trivandrum

Hi,

This was my first lua script, so some parts might have been wrong. Still, I think the first injection was needed, and your third message seems to confirm this so it gives me some confidence ;)

Regarding your second message, it's possible. There was a new release of mysql-proxy just after I published this, and apparently, some changes rendered my script obsolete. I've joined the mysql-proxy team on launchpad to try to keep a closer eye on the project and I promise to make an updated post on this blog with a new script that works.

In fact, if it's approved by the team, I'll publish in the cookbook where it'll be much more useful.

I think your project is quite interesting and might be in fact easier to implement for people working on the LAMP stack and on shared hosting environments (where installing the proxy is totally out of the question), so please keep up the good work!

fernando

Hi,

This was my first lua script, so some parts might have been wrong. Still, I think the first injection was needed, and your third message seems to confirm this so it gives me some confidence ;)

Regarding your second message, it's possible. There was a new release of mysql-proxy just after I published this, and apparently, some changes rendered my script obsolete. I've joined the mysql-proxy team on launchpad to try to keep a closer eye on the project and I promise to make an updated post on this blog with a new script that works.

In fact, if it's approved by the team, I'll publish in the cookbook where it'll be much more useful.

I think your project is quite interesting and might be in fact easier to implement for people working on the LAMP stack and on shared hosting environments (where installing the proxy is totally out of the question), so please keep up the good work!

fernando

hi fernando,

I never thought about the shared hosting part, thanks for boosting my morale, I would be quite happy if you would comment on my relevant page

http://www.php-trivandrum.o...

php trivandrum

hi fernando,

I never thought about the shared hosting part, thanks for boosting my morale, I would be quite happy if you would comment on my relevant page

http://www.php-trivandrum.o...

php trivandrum

Awesome! I never realized you could grab the output of a PROCEDURE ANALYSE() like that.

Thanks!

Roland Bouman

Awesome! I never realized you could grab the output of a PROCEDURE ANALYSE() like that.

Thanks!

Roland Bouman

Run ANALYSE(10,100) rather than without params. This sets the max# of ENUM items to 10 and a total of 100 bytes in the ENUM list.
In short, it gets rid of the nonsensical default of trying to stick everything in an enum.

See http://bugs.mysql.com/2049

Arjen Lentz

Run ANALYSE(10,100) rather than without params. This sets the max# of ENUM items to 10 and a total of 100 bytes in the ENUM list.
In short, it gets rid of the nonsensical default of trying to stick everything in an enum.

See http://bugs.mysql.com/2049

Arjen Lentz

Well done!

Shlomi Noach

Well done!

Shlomi Noach

Thanks for all the comments!

@Arjen: I'll add these limits to my todo. I followed your link and couldn't help to laugh at the first reply you got. It's no wonder there are so many MySQL forks out there.

fernando

Thanks for all the comments!

@Arjen: I'll add these limits to my todo. I followed your link and couldn't help to laugh at the first reply you got. It's no wonder there are so many MySQL forks out there.

fernando

Good work.

With regard to overused indexes, you should note that an index on (a,b) and on (a,c) have nothing useless between them. But your code will report "a" as being over indexed.

Shlomi Noach

Good work.

With regard to overused indexes, you should note that an index on (a,b) and on (a,c) have nothing useless between them. But your code will report "a" as being over indexed.

Shlomi Noach

Thanks a lot for the observation, that didn't occur to me :)

I'll have to look just for columns that have count(*) > 1 and that are also indexed alone, this would be the useless index (of course, it might be that one index takes duplicates, the other not, I don't know if that still makes all cases useless, you've really opened a can of worms for me :)

Fortunately we've got the thing at bitbucket now so I'll report this as a bug and begin working on it ASAP!

Fernando Ipar

Thanks a lot for the observation, that didn't occur to me :)

I'll have to look just for columns that have count(*) > 1 and that are also indexed alone, this would be the useless index (of course, it might be that one index takes duplicates, the other not, I don't know if that still makes all cases useless, you've really opened a can of worms for me :)

Fortunately we've got the thing at bitbucket now so I'll report this as a bug and begin working on it ASAP!

Fernando Ipar

Hi,
Still not complete. If the the index on (a) is UNIQUE, then an index (a,b) does make it redundant.
Since there's already a utility which checks for duplicate indexes, and it's quite complicated, I suggest you take a look how it does it. It is part of maatkit, and is called mk-duplicate-key-checker. See http://code.google.com/p/maatkit/wiki/DeterminingDuplicateKeys

Shlomi Noach

Hi,
Still not complete. If the the index on (a) is UNIQUE, then an index (a,b) does make it redundant.
Since there's already a utility which checks for duplicate indexes, and it's quite complicated, I suggest you take a look how it does it. It is part of maatkit, and is called mk-duplicate-key-checker. See http://code.google.com/p/maatkit/wiki/DeterminingDuplicateKeys

Shlomi Noach

Thanks for the additional suggestion, as I said, I expected that kind of problems. I'll look into that particular maatkit script for ideas, I don't intend to reinvent the wheel but rather to present DBAs with some useful screenshot of information, but it has to be useful and properly obtained :)

Fernando Ipar

Thanks for the additional suggestion, as I said, I expected that kind of problems. I'll look into that particular maatkit script for ideas, I don't intend to reinvent the wheel but rather to present DBAs with some useful screenshot of information, but it has to be useful and properly obtained :)

Fernando Ipar

Sorry but:

1. it can handle referential integrity
2. you don't have to create the XML file, just use schema grabber

See http://pabloj.blogspot.com/...

pabloj

Sorry but:

1. it can handle referential integrity
2. you don't have to create the XML file, just use schema grabber

See http://pabloj.blogspot.com/...

pabloj

Hey, no need to be sorry, that's good news :)
My mistake was going straight to the configuration section of the manual. Since foreign keys are handles automatically according to this: http://dbmonster.kernelpani..., there's no configuration key/option for them.
I'll run new tests taking this into consideration and either update this post or write a new one.

Thanks for the info and the link!

fernando

Hey, no need to be sorry, that's good news :)
My mistake was going straight to the configuration section of the manual. Since foreign keys are handles automatically according to this: http://dbmonster.kernelpani..., there's no configuration key/option for them.
I'll run new tests taking this into consideration and either update this post or write a new one.

Thanks for the info and the link!

fernando

Very Nice!

I have been trying to find a way to prevent Apache from spawning too many child processes whenever some bad robot or hacker throws a bunch of concurrent requests at our website.

This may just do the trick...

Edward Hall

Very Nice!

I have been trying to find a way to prevent Apache from spawning too many child processes whenever some bad robot or hacker throws a bunch of concurrent requests at our website.

This may just do the trick...

Edward Hall

Glad you found it useful!

This might be of interest to you too: http://www.cohprog.com/mod_..., though I don't know if it's still active, or if it works across apache v 1.x and 2.x.

The good side of the iptables approach is that it can be used to limit concurrent connections to any network based service

Good luck,

fernando

Glad you found it useful!

This might be of interest to you too: http://www.cohprog.com/mod_..., though I don't know if it's still active, or if it works across apache v 1.x and 2.x.

The good side of the iptables approach is that it can be used to limit concurrent connections to any network based service

Good luck,

fernando

Indeed, I ran another test and it works exactly as described:

mysql -e 'create database rtest'
[... create some inodb tables on rtest, with foreign key constraints ... ]

java pl.kernelpanic.dbmonster.Launcher -c rtest.properties --grab > rtest.xml

After properly editing the begining of the rtest.xml file (since it will included some non-xml output from dbmonster):

java pl.kernelpanic.dbmonster.Launcher -c rtest.properties -s rtest.xml

This successfully generates rows for all tables, satisfying referential integrity :)

fernando

Indeed, I ran another test and it works exactly as described:

mysql -e 'create database rtest'
[... create some inodb tables on rtest, with foreign key constraints ... ]

java pl.kernelpanic.dbmonster.Launcher -c rtest.properties --grab > rtest.xml

After properly editing the begining of the rtest.xml file (since it will included some non-xml output from dbmonster):

java pl.kernelpanic.dbmonster.Launcher -c rtest.properties -s rtest.xml

This successfully generates rows for all tables, satisfying referential integrity :)

fernando

Thanks for this article, it was really helpful! =)

Julio

Thanks for this article, it was really helpful! =)

Julio

i hope that u continue on posting more samples of programs... thnx!

chris

i hope that u continue on posting more samples of programs... thnx!

chris

Thanks for the message chris.
I've been neglecting the blog due to work lately, but I'm sure coming back to it.
I'm standing by for OpenSQL Camp in Portland with my Percona team mates right now, that should give me some nice blogging material for my return :)

Fernando

Thanks for the message chris.
I've been neglecting the blog due to work lately, but I'm sure coming back to it.
I'm standing by for OpenSQL Camp in Portland with my Percona team mates right now, that should give me some nice blogging material for my return :)

Fernando

Thanks for an insight into a very thorny problem I am tackling in my latest project. Now, I need to grok this and do some testing.

JonB

Thanks for an insight into a very thorny problem I am tackling in my latest project. Now, I need to grok this and do some testing.

JonB

Hm, what if there is a user-defined function (UDF) which may have side effects?

explain select id from projects where id = (select my_udf(max(id)) from projects where name like 'en%');

Would this EXPLAIN then actually call my_udf()? Seems a bit scary (but maybe one should expect that?)

Kristian Nielsen

Hm, what if there is a user-defined function (UDF) which may have side effects?

explain select id from projects where id = (select my_udf(max(id)) from projects where name like 'en%');

Would this EXPLAIN then actually call my_udf()? Seems a bit scary (but maybe one should expect that?)

Kristian Nielsen

Kristian:

I've poked around the source before to try and figure things out, but code for handling explain seems to be a bit distributed on the code base, so it takes more than just a few minutes reading to understand it.

So that leaves me with my second best guess: strace.

Take a look at this:

(I'm using lookup from udf_example.so)

mysql> select lookup(url) from urls;
+---------------+
| lookup(url) |
+---------------+
| 69.80.205.238 |
| 69.89.31.108 |
| 69.80.205.238 |
+---------------+
3 rows in set (0.09 sec)

And here's the strace relevant for the lookup:

[pid 8205] 0.000120 write(3, "100127 16:40:23\t 9 Query select lookup(url) from urls\n"..., 65) = 65
...
[pid 8205] 0.000878 open("/etc/resolv.conf", O_RDONLY) = 37
[pid 8205] 0.000179 fstat(37, {st_mode=S_IFREG|0644, st_size=240, ...}) = 0

Now with explain:

mysql> explain select lookup(url) from urls;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | urls | ALL | NULL | NULL | NULL | NULL | 3 | |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

[pid 8205] 0.000137 write(3, "100127 16:40:35\t 9 Query explain select lookup(url) from urls\n"..., 73) = 73
[pid 8205] 0.000185 sched_setscheduler(8205, SCHED_OTHER, { 6 }) = -1 EINVAL (Invalid argument)
[pid 8205] 0.000280 gettimeofday({1264617635, 884763}, NULL) = 0
[pid 8205] 0.000091 gettimeofday({1264617635, 884852}, NULL) = 0
[pid 8205] 0.000266 write(29, "\x01\x00\x00\x01\x0a\x18\x00\x00\x02\x03\x64\x65\x66\x00\x00\x00\x02\x69\x64\x00\x0c\x3f\x00\x03\x00\x00\x00\x08\xa1\x00
\x00\x00\x00\x21\x00\x00\x03\x03\x64\x65\x66\x00\x00\x00\x0b\x73\x65\x6c\x65\x63\x74\x5f\x74\x79\x70\x65\x00\x0c\x08\x00\x13\x00\x00\x00\xfd\x01\x00\x1f\x00\
x00\x1b\x00\x00\x04\x03\x64\x65\x66\x00\x00\x00\x05\x74\x61\x62\x6c\x65\x00\x0c\x08\x00\x40\x00\x00\x00\xfd\x00\x00\x1f\x00\x00\x1a\x00\x00\x05\x03\x64\x65\x
66\x00\x00\x00\x04\x74\x79\x70\x65\x00\x0c\x08\x00\x0a\x00\x00\x00\xfd\x00\x00\x1f\x00\x00\x23\x00\x00\x06\x03\x64\x65\x66\x00\x00\x00\x0d\x70\x6f\x73\x73\x6
9\x62\x6c\x65\x5f\x6b\x65\x79\x73\x00\x0c\x08\x00\x00\x10\x00\x00\xfd\x00\x00\x1f\x00\x00\x19\x00\x00\x07\x03\x64\x65\x66\x00\x00\x00\x03\x6b\x65\x79\x00\x0c
\x08\x00\x40\x00\x00\x00\xfd\x00\x00\x1f\x00\x00\x1d\x00\x00\x08\x03\x64\x65\x66\x00\x00\x00\x07\x6b\x65\x79\x5f\x6c\x65\x6e\x00\x0c\x08\x00\x00\x10\x00\x00\
xfd\x00\x00\x1f\x00\x00\x19\x00\x00\x09\x03\x64\x65\x66\x00\x00\x00\x03\x72\x65\x66\x00\x0c\x08\x00\x00\x04\x00\x00\xfd\x00\x00\x1f\x00\x00\x1a\x00\x00\x0a\x
03\x64\x65\x66\x00\x00\x00\x04\x72\x6f\x77\x73\x00\x0c\x3f\x00\x0a\x00\x00\x00\x08\xa0\x00\x00\x00\x00\x1b\x00\x00\x0b\x03\x64\x65\x66\x00\x00\x00\x05\x45\x7
8\x74\x72\x61\x00\x0c\x08\x00\xff\x00\x00\x00\xfd\x01\x00\x1f\x00\x00\x05\x00\x00\x0c\xfe\x00\x00\x02\x00\x19\x00\x00\x0d\x01\x31\x06\x53\x49\x4d\x50\x4c\x45
\x04\x75\x72\x6c\x73\x03\x41\x4c\x4c\xfb\xfb\xfb\xfb\x01\x33\x00\x05\x00\x00\x0e\xfe\x00\x00\x02\x00"..., 369) = 369
[pid 8205] 0.000457 gettimeofday({1264617635, 885575}, NULL) = 0
[pid 8205] 0.000092 sched_setscheduler(8205, SCHED_OTHER, { 8 }) = -1 EINVAL (Invalid argument)
[pid 8205] 0.000103 gettimeofday({1264617635, 885770}, NULL) = 0
[pid 8205] 0.000084 gettimeofday({1264617635, 885853}, NULL) = 0
[pid 8205] 0.000086 gettimeofday({1264617635, 885939}, NULL) = 0
[pid 8205] 0.000116 gettimeofday({1264617635, 886058}, NULL) = 0
[pid 8205] 0.000130 write(4, "# Time: 100127 16:40:35\n# [email protected]: root[root] @ fernandoipar.com []\n# Thread_id: 9 Schema: test\n# Query_time: 0.001985 Lo
ck_time: 0.000690 Rows_sent: 1 Rows_examined: 0 Rows_affected: 0 Rows_read: 1\nexplain select lookup(url) from urls;\n"..., 238) = 238
...

To discard a possible dns caching effect, I ran the original query twice on the same connection, and got the same strace output, so it's always going to /etc/resolv.conf

Still, without looking at the source code, this is just anecdotical evidence from one example, I'll try to look more into this and perhaps make a new post on the subject :)

Fernando Ipar

Kristian:

I've poked around the source before to try and figure things out, but code for handling explain seems to be a bit distributed on the code base, so it takes more than just a few minutes reading to understand it.

So that leaves me with my second best guess: strace.

Take a look at this:

(I'm using lookup from udf_example.so)

mysql> select lookup(url) from urls;
+---------------+
| lookup(url) |
+---------------+
| 69.80.205.238 |
| 69.89.31.108 |
| 69.80.205.238 |
+---------------+
3 rows in set (0.09 sec)

And here's the strace relevant for the lookup:

[pid 8205] 0.000120 write(3, "100127 16:40:23\t 9 Query select lookup(url) from urls\n"..., 65) = 65
...
[pid 8205] 0.000878 open("/etc/resolv.conf", O_RDONLY) = 37
[pid 8205] 0.000179 fstat(37, {st_mode=S_IFREG|0644, st_size=240, ...}) = 0

Now with explain:

mysql> explain select lookup(url) from urls;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | urls | ALL | NULL | NULL | NULL | NULL | 3 | |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

[pid 8205] 0.000137 write(3, "100127 16:40:35\t 9 Query explain select lookup(url) from urls\n"..., 73) = 73
[pid 8205] 0.000185 sched_setscheduler(8205, SCHED_OTHER, { 6 }) = -1 EINVAL (Invalid argument)
[pid 8205] 0.000280 gettimeofday({1264617635, 884763}, NULL) = 0
[pid 8205] 0.000091 gettimeofday({1264617635, 884852}, NULL) = 0
[pid 8205] 0.000266 write(29, "\x01\x00\x00\x01\x0a\x18\x00\x00\x02\x03\x64\x65\x66\x00\x00\x00\x02\x69\x64\x00\x0c\x3f\x00\x03\x00\x00\x00\x08\xa1\x00
\x00\x00\x00\x21\x00\x00\x03\x03\x64\x65\x66\x00\x00\x00\x0b\x73\x65\x6c\x65\x63\x74\x5f\x74\x79\x70\x65\x00\x0c\x08\x00\x13\x00\x00\x00\xfd\x01\x00\x1f\x00\
x00\x1b\x00\x00\x04\x03\x64\x65\x66\x00\x00\x00\x05\x74\x61\x62\x6c\x65\x00\x0c\x08\x00\x40\x00\x00\x00\xfd\x00\x00\x1f\x00\x00\x1a\x00\x00\x05\x03\x64\x65\x
66\x00\x00\x00\x04\x74\x79\x70\x65\x00\x0c\x08\x00\x0a\x00\x00\x00\xfd\x00\x00\x1f\x00\x00\x23\x00\x00\x06\x03\x64\x65\x66\x00\x00\x00\x0d\x70\x6f\x73\x73\x6
9\x62\x6c\x65\x5f\x6b\x65\x79\x73\x00\x0c\x08\x00\x00\x10\x00\x00\xfd\x00\x00\x1f\x00\x00\x19\x00\x00\x07\x03\x64\x65\x66\x00\x00\x00\x03\x6b\x65\x79\x00\x0c
\x08\x00\x40\x00\x00\x00\xfd\x00\x00\x1f\x00\x00\x1d\x00\x00\x08\x03\x64\x65\x66\x00\x00\x00\x07\x6b\x65\x79\x5f\x6c\x65\x6e\x00\x0c\x08\x00\x00\x10\x00\x00\
xfd\x00\x00\x1f\x00\x00\x19\x00\x00\x09\x03\x64\x65\x66\x00\x00\x00\x03\x72\x65\x66\x00\x0c\x08\x00\x00\x04\x00\x00\xfd\x00\x00\x1f\x00\x00\x1a\x00\x00\x0a\x
03\x64\x65\x66\x00\x00\x00\x04\x72\x6f\x77\x73\x00\x0c\x3f\x00\x0a\x00\x00\x00\x08\xa0\x00\x00\x00\x00\x1b\x00\x00\x0b\x03\x64\x65\x66\x00\x00\x00\x05\x45\x7
8\x74\x72\x61\x00\x0c\x08\x00\xff\x00\x00\x00\xfd\x01\x00\x1f\x00\x00\x05\x00\x00\x0c\xfe\x00\x00\x02\x00\x19\x00\x00\x0d\x01\x31\x06\x53\x49\x4d\x50\x4c\x45
\x04\x75\x72\x6c\x73\x03\x41\x4c\x4c\xfb\xfb\xfb\xfb\x01\x33\x00\x05\x00\x00\x0e\xfe\x00\x00\x02\x00"..., 369) = 369
[pid 8205] 0.000457 gettimeofday({1264617635, 885575}, NULL) = 0
[pid 8205] 0.000092 sched_setscheduler(8205, SCHED_OTHER, { 8 }) = -1 EINVAL (Invalid argument)
[pid 8205] 0.000103 gettimeofday({1264617635, 885770}, NULL) = 0
[pid 8205] 0.000084 gettimeofday({1264617635, 885853}, NULL) = 0
[pid 8205] 0.000086 gettimeofday({1264617635, 885939}, NULL) = 0
[pid 8205] 0.000116 gettimeofday({1264617635, 886058}, NULL) = 0
[pid 8205] 0.000130 write(4, "# Time: 100127 16:40:35\n# [email protected]: root[root] @ fernandoipar.com []\n# Thread_id: 9 Schema: test\n# Query_time: 0.001985 Lo
ck_time: 0.000690 Rows_sent: 1 Rows_examined: 0 Rows_affected: 0 Rows_read: 1\nexplain select lookup(url) from urls;\n"..., 238) = 238
...

To discard a possible dns caching effect, I ran the original query twice on the same connection, and got the same strace output, so it's always going to /etc/resolv.conf

Still, without looking at the source code, this is just anecdotical evidence from one example, I'll try to look more into this and perhaps make a new post on the subject :)

Fernando Ipar

it is possible to see the "bad list" ?

antonio

it is possible to see the "bad list" ?

antonio

Can you update us on how you did on the exam? Any particular areas we should focus on in our study?

Onjefu

Can you update us on how you did on the exam? Any particular areas we should focus on in our study?

Onjefu

It was very well written article - I liked that You showed the executed commands along with the tables, so readers can learn and try those commands on their own.

Thanks!

rauni

It was very well written article - I liked that You showed the executed commands along with the tables, so readers can learn and try those commands on their own.

Thanks!

rauni

Fernando,

I wonder if you would get the same performance results if you exhaust the enum data type with up to 2^16-1 elements. I mean, what happens, if you have many different types, would it be better to pre-define all possible combinations in the scheme or would it be better to use a normalized scheme? Just from a performance point of view of course, elegance is somewhat different :D

But I think the difference would be negigible because you only shift the lookup problem.

Robert

Robert

Fernando,

I wonder if you would get the same performance results if you exhaust the enum data type with up to 2^16-1 elements. I mean, what happens, if you have many different types, would it be better to pre-define all possible combinations in the scheme or would it be better to use a normalized scheme? Just from a performance point of view of course, elegance is somewhat different :D

But I think the difference would be negigible because you only shift the lookup problem.

Robert

Robert

Hi Tocayo,

Greetings from across the pond. Do you know if “Proactive measures to minimize MySQL downtime" will be recorded? Or, at the very least, will the presentation (odf or whatever) be made online?

Kudos and good luck
FC

Fernando Cassia (@fcassia)

Hi Tocayo,

Greetings from across the pond. Do you know if “Proactive measures to minimize MySQL downtime" will be recorded? Or, at the very least, will the presentation (odf or whatever) be made online?

Kudos and good luck
FC

Fernando Cassia (@fcassia)

Woot! See you there!

Ronald Bradford

Woot! See you there!

Ronald Bradford

youtube.com @Fernando: I honestly don't know :) There is a recording of a similar (english) presentation I delivered as webinar, here: http://www.youtube.com/watc...

Thanks for the good luck, and hope to see you there!

fernando

youtube.com @Fernando: I honestly don't know :) There is a recording of a similar (english) presentation I delivered as webinar, here: http://www.youtube.com/watc...

Thanks for the good luck, and hope to see you there!

fernando

@Ronald: I was on vacation other times you came nearby, so I'm glad I will get to you see you this time! If you go on a latin american tour again, ping me if you'd like to present at the Montevideo MySQL Meetup. The group would surely love that!

fernando

@Ronald: I was on vacation other times you came nearby, so I'm glad I will get to you see you this time! If you go on a latin american tour again, ping me if you'd like to present at the Montevideo MySQL Meetup. The group would surely love that!

fernando

Hi Fernando,
This seems like a great tool but I'm getting an error -
ERROR 1221 (HY000): Incorrect usage of PROCEDURE and non-SELECT
I'm on the latest 5.6.x, maybe that's got something to do with it.
Any ideas?
Thanks

Alex

Hi Fernando,
This seems like a great tool but I'm getting an error -
ERROR 1221 (HY000): Incorrect usage of PROCEDURE and non-SELECT
I'm on the latest 5.6.x, maybe that's got something to do with it.
Any ideas?
Thanks

Alex

Hello Alex,

You're right, this stopped working on 5.6 Not just the stored procedure, but apparently any use of procedure analyse() other than a select.

So far I have only found a changelog mention for 5.6.6 about this, but I tired 5.6.5 and this did not work either so the change must have happened before.

I'll dig around a bit to see what I can find.

fernando

Hello Alex,

You're right, this stopped working on 5.6 Not just the stored procedure, but apparently any use of procedure analyse() other than a select.

So far I have only found a changelog mention for 5.6.6 about this, but I tired 5.6.5 and this did not work either so the change must have happened before.

I'll dig around a bit to see what I can find.

fernando

This has been extremely useful, thank you!

Jason

This has been extremely useful, thank you!

Jason

mysql> insert into procedure_analyse_output select * from pet procedure analyse()
-> ;
ERROR 1221 (HY000): Incorrect usage of PROCEDURE and non-SELECT

root

mysql> insert into procedure_analyse_output select * from pet procedure analyse()
-> ;
ERROR 1221 (HY000): Incorrect usage of PROCEDURE and non-SELECT

root

hello fernando,
it can work this way:
mysql> select * from pet procedure analyse() into OUTFILE "ana.txt";
mysql> load data infile "ana.txt" into procedure_analyse_output;
than you la

root

hello fernando,
it can work this way:
mysql> select * from pet procedure analyse() into OUTFILE "ana.txt";
mysql> load data infile "ana.txt" into procedure_analyse_output;
than you la

root

More detailed example w/graph:

https://gist.github.com/fip...

fernando

I always experienced lots of problems using MySQL ENUMs , and never going to use it again.

Tony

I always experienced lots of problems using MySQL ENUMs , and never going to use it again.

Tony

Nice write up! Thanks for the shout-out :)

Roland.

Roland Bouman

Nice write up! Thanks for the shout-out :)

Roland.

Roland Bouman

Thank you for this.

As an addition: since 5.6, GA more than 2 years ago, Oracle ships its MySQL packages with strict SQL defaults on the config file (people may not have noticed because of Linux distribution defaults or them using ancients versions) and compiled in defaults in 5.7.

The only reason the change is taking so long is due to backwards compatibility with old applications.

Other myth that people continue repeating over and over without really understanding the internals is that data is not really safe in MySQL: By default, MySQL is fully ACID, synchronous to disk, data is written twice to secondary storage to avoid partial page writes in case the system crashes, and mantains a checksum for every 16K bytes of information, shutting down automatically if it detects physical corruption. Despite that, it still beats some of the other systems in throughput and latency.

I would love if some people stopped attacking MySQL and instead praised their favourite DB system. Or better, stopped having a favourite system and praised the strong points and hated the weak ones of every option. Like programming languages, every option has its cons and pros.

Jaime Crespo

Thank you for this.

As an addition: since 5.6, GA more than 2 years ago, Oracle ships its MySQL packages with strict SQL defaults on the config file (people may not have noticed because of Linux distribution defaults or them using ancients versions) and compiled in defaults in 5.7.

The only reason the change is taking so long is due to backwards compatibility with old applications.

Other myth that people continue repeating over and over without really understanding the internals is that data is not really safe in MySQL: By default, MySQL is fully ACID, synchronous to disk, data is written twice to secondary storage to avoid partial page writes in case the system crashes, and mantains a checksum for every 16K bytes of information, shutting down automatically if it detects physical corruption. Despite that, it still beats some of the other systems in throughput and latency.

I would love if some people stopped attacking MySQL and instead praised their favourite DB system. Or better, stopped having a favourite system and praised the strong points and hated the weak ones of every option. Like programming languages, every option has its cons and pros.

Jaime Crespo

As a "Fan Boy" I have to say, Thanks for a good post. I'll go out on a limb and say... As a push by every App Dev to become a rock star and write their own data store, MySQL has fallen into the "Old School" bucket with Oracle. DBA's are being ask to for the "New Stuff" without being ask for their expertise. Things like the memcache interface to MySQL are simply ignored by developers.

Mark Grennan

As a "Fan Boy" I have to say, Thanks for a good post. I'll go out on a limb and say... As a push by every App Dev to become a rock star and write their own data store, MySQL has fallen into the "Old School" bucket with Oracle. DBA's are being ask to for the "New Stuff" without being ask for their expertise. Things like the memcache interface to MySQL are simply ignored by developers.

Mark Grennan

pgsql has bitmap join optimization, in that it can scan more than one b-tree, create a bitmap and logically combine them. This is faster than "merge intersection" that MySQL has. pgsql doesn't have real bitmap indexes to the best of my knowledge, unless perhaps they were added in the latest release and I missed them.

And if you want real bitmap indexes for MySQL, you can always use my FastBit_UDF. It is a little clunky, but very very fast.

Justin Swanhart

pgsql has bitmap join optimization, in that it can scan more than one b-tree, create a bitmap and logically combine them. This is faster than "merge intersection" that MySQL has. pgsql doesn't have real bitmap indexes to the best of my knowledge, unless perhaps they were added in the latest release and I missed them.

And if you want real bitmap indexes for MySQL, you can always use my FastBit_UDF. It is a little clunky, but very very fast.

Justin Swanhart

Justin, I was thinking of the Bizgres patches that did/do provide on-disk bitmap indexes, but it seems it never made it into a stable version (I'll admit I haven't seriously played with PostgreSQL in a while...)

fernando

Justin, I was thinking of the Bizgres patches that did/do provide on-disk bitmap indexes, but it seems it never made it into a stable version (I'll admit I haven't seriously played with PostgreSQL in a while...)

fernando

Sadly, when MySQL violates the policy of least surprise it is not surprising.

Justin Swanhart

Perhaps this is obvious, but what are the implications?

Certainly, if the majority of a tuple's access was "ORDER BY indexed_field DESC", there would be a performance hit, but are there other cases where it is really important to have descending indexes?

$58077795

The whole point of the DESC keyword is to support accessing the index in descending order more efficiently. If the database silently ignores this request, it sucks.

Much worse is that you can define CHECK constraints, and they parse correctly, but they do nothing. Or on MyISAM you can define foreign keys and you get no error, but they are not created. Or a myriad of other toy database behaviors that make me wonder why I ever picked up MySQL until I realize it was forced upon me and somehow I just ended up with Stockholm Syndrome for it.

Justin Swanhart

That last comment of yours ought to make it to a t-shirt some day!

Fernando Ipar

I have not dived in this subject in details, but my guess is that InnoDB does not support DESC indexes, but another storage engine could. So it is a good thing that MySQL support the syntax. However, it should at least give a warning, and maybe even an error in 5.5 and 5.6. Did you open a bugs about this ?

Jean-François Gagné

"The whole point of the DESC keyword is to support accessing the index in descending order more efficiently."

That is not the whole point. I think it is more important that it makes it possible to use a multi-column index for ordering even when ordering is a mix of ASC and DESC.

Øystein Grøvlen

You're right that the rationale here would be differences in support by storage engine, as with foreign keys, but in this case, before MySQL 8.0, it does not really matter since MySQL itself ignores the modifiers so even if an engine would support those, it wouldn't get used. I thought about filing a bug report but since it works as expected on 8.0 and I would not expect this to be changed in earlier versions, I didn't do it.

Fernando Ipar

That was the point, but we can split hairs if you want. If I use an ORDER BY that uses the ASC/DESC keywords as defined in the index, I expect that ORDER BY to be efficient and use the index as I defined it, which is more efficient than a filesort. But since the index ignores the order, the database can not efficiently execute the query that I specifically asked the database to optimize.

Justin Swanhart

It should at least give a warning similar to the optimize table warning on InnoDB

Daniël van Eeden

Ever notice how SHOW PROFILES always has 99+% of the time in "Sending data"? That makes me wonder if there is any use for SHOW PROFILES. What did you hope to show with it? (Meanwhile, please wrap queries; they are hard to read when scrolling is needed.)

Rick James