<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Fernando Ipar &#187; datatypes</title>
	<atom:link href="http://fernandoipar.com/tag/datatypes/feed/" rel="self" type="application/rss+xml" />
	<link>http://fernandoipar.com</link>
	<description>I love mankind! Its people I can&#039;t stand!</description>
	<lastBuildDate>Wed, 07 Jul 2010 04:10:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
		<item>
		<title>Making use of procedure analyse()</title>
		<link>http://fernandoipar.com/2009/04/18/making-use-of-procedure-analyse/</link>
		<comments>http://fernandoipar.com/2009/04/18/making-use-of-procedure-analyse/#comments</comments>
		<pubDate>Sat, 18 Apr 2009 19:47:48 +0000</pubDate>
		<dc:creator>fernando</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[datatypes]]></category>
		<category><![CDATA[optimization]]></category>

		<guid isPermaLink="false">http://fernandoipar.com/?p=155</guid>
		<description><![CDATA[SELECT Field0[,Field1,Field2,...] FROM TABLE PROCEDURE ANALYSE() is a nice tool to find out more about your table&#8217;s columns. Still, it could be improved in a lot of ways, and the stored procedure below is a starting point. It makes use of procedure analyse (though with &#8216;SELECT * FROM&#8217;), and modifies it&#8217;s output to include the [...]


Related posts:<ol><li><a href='http://fernandoipar.com/2009/04/21/extending-procedure_analyse/' rel='bookmark' title='Permanent Link: Extending procedure_analyse'>Extending procedure_analyse</a> <small>My previous post explored a stored procedure that extended procedure_analyse...</small></li><li><a href='http://fernandoipar.com/2009/08/12/indexing-text-columns-in-mysql/' rel='bookmark' title='Permanent Link: Indexing text columns in MySQL'>Indexing text columns in MySQL</a> <small>This time, I&#8217;m talking about indexes for string typed columns....</small></li><li><a href='http://fernandoipar.com/2009/04/29/soundex-triggers-and-stored-procedures/' rel='bookmark' title='Permanent Link: SOUNDEX(), triggers, and stored procedures'>SOUNDEX(), triggers, and stored procedures</a> <small>MySQL provides a SOUNDEX() function, which returns the soundex of...</small></li></ol>

Related posts brought to you by <a href='http://mitcho.com/code/yarpp/'>Yet Another Related Posts Plugin</a>.]]></description>
			<content:encoded><![CDATA[<p>SELECT Field0[,Field1,Field2,...] FROM TABLE PROCEDURE ANALYSE() is a nice tool to find out more about your table&#8217;s columns.</p>
<p>Still, it could be improved in a lot of ways, and the stored procedure below is a starting point. It makes use of <em>procedure analyse</em> (though with &#8216;SELECT * FROM&#8217;), and modifies it&#8217;s output to include the actual column datatype and the total number of rows of the table.</p>
<p>The actual datatype is a piece of information I&#8217;ve seen a lot of people request, and the number of rows is, I think, a critical piece of information to determine if the output of <em>procedure analyse</em> is credible or not. It&#8217;s not the same thing to take suggestions from mysql on a table with 7 or 20 rows than from a table with 1000000 rows. Of course, remember than numbers alone mean nothing, you might just have 7 rows in a table that represent the entire universe of possible values for your problem domain. In any case, a smart human being with a good toolset is the best way to solve problems!</p>
<p>So here&#8217;s the procedure, which can also be downloaded from this <a title="extended_procedure_analyse.sql" href="http://fernandoipar.com/extended_procedure_analyse.sql">link</a>:</p>
<pre>/*

extended procedure analyse
(C) 2009 Fernando Ipar
mail(at)fernandoipar.com

GPLv2

*/
drop procedure if exists extended_procedure_analyse;
delimiter //
create procedure extended_procedure_analyse(databaseName varchar(64), tableName varchar(64))
begin

	create temporary table procedure_analyse_output
	(
	Field_name varchar(64),
	Min_value int,
	Max_value int,
	Min_length int,
	Max_length int,
	Empties_or_zeros int,
	Nulls int,
	Avg_value_or_avg_length float,
	Std float,
	Optimal_fieldtype text,
	Actual_fieldtype text
	);	

	set @table = concat(databaseName,'.',tableName);
	set @dbName = databaseName;
	set @tbName = tableName;

	set @qry = concat('insert into procedure_analyse_output (Field_name,Min_value,Max_value,Min_length,Max_length,Empties_or_zeros,Nulls,Avg_Value_or_avg_length,Std,Optimal_fieldtype) select * from ', @table,' procedure analyse()');
	prepare myStmt from @qry;
	execute myStmt;

	update procedure_analyse_output set Field_name = replace(Field_name,  CONCAT(databaseName,'.',tableName,'.'),'');

	prepare myStmt from 'update procedure_analyse_output pao, information_schema.columns c set pao.Actual_Fieldtype = c.column_type where table_schema = ? and table_name = ? and column_name = pao.Field_name';
	execute myStmt using @dbName,@tbName;

	set @qry = concat('select count(*) as `Total_number_of_rows` from ',@table);
	prepare myStmt from @qry;

	execute myStmt;

	select * from procedure_analyse_output;

	drop temporary table procedure_analyse_output;

end;
//
delimiter ;</pre>
<p>Here are a couple of sample outputs:</p>
<pre>mysql&gt; call extended_procedure_analyse('test','City')\G
*************************** 1. row ***************************
Total_number_of_rows: 30000
1 row in set (0.11 sec)

*************************** 1. row ***************************
             Field_name: ID
              Min_value: 925001
              Max_value: 955000
             Min_length: 6
             Max_length: 6
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 940000
                    Std: 938839
      Optimal_fieldtype: MEDIUMINT(6) UNSIGNED NOT NULL
       Actual_fieldtype: int(11) unsigned
*************************** 2. row ***************************
             Field_name: CountryCode
              Min_value: 0
              Max_value: 29
             Min_length: 1
             Max_length: 2
       Empties_or_zeros: 1000
                  Nulls: 0
Avg_value_or_avg_length: 14.5
                    Std: 8.6554
      Optimal_fieldtype: ENUM('0','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29') NOT NULL
       Actual_fieldtype: int(11) unsigned
*************************** 3. row ***************************
             Field_name: Name
              Min_value: 1
              Max_value: 9999
             Min_length: 1
             Max_length: 5
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 4.6605
                    Std: NULL
      Optimal_fieldtype: CHAR(5) NOT NULL
       Actual_fieldtype: varchar(40)
*************************** 4. row ***************************
             Field_name: District
              Min_value: 1
              Max_value: 9999
             Min_length: 1
             Max_length: 5
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 4.6603
                    Std: NULL
      Optimal_fieldtype: CHAR(5) NOT NULL
       Actual_fieldtype: varchar(40)
*************************** 5. row ***************************
             Field_name: Population
              Min_value: 0
              Max_value: 9999
             Min_length: 1
             Max_length: 5
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 4.6647
                    Std: NULL
      Optimal_fieldtype: CHAR(5) NOT NULL
       Actual_fieldtype: varchar(40)
5 rows in set (0.12 sec)

Query OK, 0 rows affected (0.12 sec)

mysql&gt; call extended_procedure_analyse('test','projects_innodb')\G
*************************** 1. row ***************************
Total_number_of_rows: 1007366
1 row in set (14.80 sec)

*************************** 1. row ***************************
             Field_name: id
              Min_value: 1
              Max_value: 1007366
             Min_length: 1
             Max_length: 7
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 503684
                    Std: 581599
      Optimal_fieldtype: MEDIUMINT(7) UNSIGNED NOT NULL
       Actual_fieldtype: int(10) unsigned
*************************** 2. row ***************************
             Field_name: name
              Min_value: 0
              Max_value: 9999
             Min_length: 1
             Max_length: 10
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 4.6958
                    Std: NULL
      Optimal_fieldtype: VARCHAR(10) NOT NULL
       Actual_fieldtype: char(10)
2 rows in set (14.80 sec)

Query OK, 0 rows affected (14.80 sec)

mysql&gt; call extended_procedure_analyse('test','projects_isam')\G
*************************** 1. row ***************************
Total_number_of_rows: 1000000
1 row in set (0.56 sec)

*************************** 1. row ***************************
             Field_name: id
              Min_value: 1
              Max_value: 1000000
             Min_length: 1
             Max_length: 7
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 500000
                    Std: 577358
      Optimal_fieldtype: MEDIUMINT(7) UNSIGNED NOT NULL
       Actual_fieldtype: int(10) unsigned
*************************** 2. row ***************************
             Field_name: name
              Min_value: 0
              Max_value: 9999
             Min_length: 1
             Max_length: 5
       Empties_or_zeros: 0
                  Nulls: 0
Avg_value_or_avg_length: 4.6605
                    Std: NULL
      Optimal_fieldtype: CHAR(5) NOT NULL
       Actual_fieldtype: char(10)
2 rows in set (0.56 sec)

Query OK, 0 rows affected (0.56 sec)</pre>
<p>Notice the difference in response time between the Innodb and MyISAM (yes, I wrongly used the &#8216;isam&#8217; name &#8230;) tables, that&#8217;s because Innodb has to calculate the number of rows for a count(*) query, while MyISAM stores a row count in the table.</p>
<p>I plan to extend this procedure to include index information, with useful data such as overindexed columns (columns that are included as a leftmost prefix in more than one index) and unindexed columns that are queried. The first question is easily answered from information_schema.statistics, using the seq_in_index column, I&#8217;m working on the second one.</p>
<p>Still, if you have the time and interest, play with this early version and let me know what&#8217;s wrong and/or could be improved with it.</p>


<p>Related posts:<ol><li><a href='http://fernandoipar.com/2009/04/21/extending-procedure_analyse/' rel='bookmark' title='Permanent Link: Extending procedure_analyse'>Extending procedure_analyse</a> <small>My previous post explored a stored procedure that extended procedure_analyse...</small></li><li><a href='http://fernandoipar.com/2009/08/12/indexing-text-columns-in-mysql/' rel='bookmark' title='Permanent Link: Indexing text columns in MySQL'>Indexing text columns in MySQL</a> <small>This time, I&#8217;m talking about indexes for string typed columns....</small></li><li><a href='http://fernandoipar.com/2009/04/29/soundex-triggers-and-stored-procedures/' rel='bookmark' title='Permanent Link: SOUNDEX(), triggers, and stored procedures'>SOUNDEX(), triggers, and stored procedures</a> <small>MySQL provides a SOUNDEX() function, which returns the soundex of...</small></li></ol></p>
<p>Related posts brought to you by <a href='http://mitcho.com/code/yarpp/'>Yet Another Related Posts Plugin</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://fernandoipar.com/2009/04/18/making-use-of-procedure-analyse/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
