The Linux Bloke

Who's the Biggest Geek on the Internet?


I run a number of web sites with WordPress, and have recently upgraded many of them to WordPress 3.0.1. It seems that every time I upgrade WordPress, some of the many and varied plugins simply break.

Why is this? I mean, in this day and age of well-defined APIs, OOP, proper object factorization, unit testing, and the like, you’d think this would rarely happen, right?

But, alas, the real problem is that most WordPress plugins are simply horribly written. Period. Way too much dependence on global variables, no use of PHP’s OOP capabilities (even though WordPress definitely supports OOP), and just poor code organization all around.

So, whenever WordPress changes something, lots of plugins simply fail to work properly. It’s so prevalent it drives me mad.

Proper software design principles apply whether you are programming in C++, Ruby, Perl, Java, or PHP. We have those design principles in place for a reason. Popular applications are never static, and should never be expected to be. That’s why we do certain things in a certain way, folks!

Of all the languages I know, I consider PHP to be the absolute worst, basically the Basic of the 21st century. Just like Basic, PHP allows you to get away with many ills. It allows you to write very sloppy code very quickly, and actually get it to work “good enough” to throw into production!!!!

But when you are talking frameworks and plugins, there is no room for being sloppy. When you are talking millions of users who must rely on your plugins, there is simply no place for taking shortcuts. You do the job right, or not at all. You adhere to the well-established, sound design principles that we have worked out over the past 30 years or so, or you go back to “school” to get a clue.

But I really fault PHP for allowing such evils in the first place. Ruby and Python strongly encourage you to do the right thing when you write code. Ruby on Rails is sweet in this regard.

In my experience (and I have 30 years of it!), it takes just as long to write good code as it does to write bad code. With bad code, you spend much more time debugging, and “fixing” the bugs probably entails writing more sloppy code to work around the existing sloppiness. So time-wise, you’re being penny wise and pound foolish.

And then comes maintainability. With poorly written code, forget it. The time it takes to maintain it balloons as the underpinnings shift and evolve. With Open Source development, what usually happens over time is one of the following:

  1. The code is abandoned and everyone stops using it.
  2. The code is re-written from the ground up (and using proper design principles finally!!!!)
  3. The code has become critical to many applications, but no one truly wants to maintain it because it’s so horrible, so it “limps along” with the barest minimum effort applied just to keep it — somewhat — running.

All of which could’ve been avoided if the code had been written properly up front. That would free developers to work on more cool stuff, giving us even more functionality, and would also allow the underlying frameworks to grow and expand without worry of breaking all the plugins and themes out there.

So get a clue, you bad PHP code slingers out there! It’s not hard at all to write good code, and it is actually quite enjoyable. Spend less time playing video games and more time educating yourself. It’ll look good on your resume and improve your bottom line. And it will make those who use your code happier. Why? Because your code won’t call attention to itself by not working, and your name is far less likely to become an expletive.

It’s up to you. Only You can write Good Code. If not you, then who else?

Let’s say you need to build a website that must support multiple languages for cultures as diverse as Japan, France, Russia, Saudi Arabia, and Brazil, as well as the US. This can be quite a daunting task, with all kinds of unexpected gotchas.

The ideal character set of choice is, of course, UTF8. Alas, you will note that most of the systems you’ll need to use default to LATIN1, including MySQL. If your site is written in PHP, that is also set to LATIN1 by default.

I find it quite puzzling that in this day and age of globalization, many of the tools don’t default to UTF8. And there are major issues with this, because everything in the chain of delivery must either be set to UTF8 or be able to handle UTF8, or you’ll see bizarreness when you attempt to display the characters of some languages. You will probably see a series of question marks (“??? ??? ?????”) instead of the actual words. Sometimes you may see a series of squares. Or maybe it looks like total garbage.
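
To make the two most common symptoms concrete, here is a minimal sketch in PHP. It assumes the mbstring extension is loaded and that the script itself is saved as UTF8; the variable names are made up for illustration. The first conversion forces Japanese text into LATIN1, which has no such characters, so each one is replaced by a question mark. The second treats UTF8 bytes as if they were LATIN1, which is the typical source of the “total garbage” case.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.

// Symptom 1: the target charset cannot represent the characters,
// so mbstring substitutes its default replacement character, '?'.
$japanese = '日本語';
$asLatin1 = mb_convert_encoding($japanese, 'ISO-8859-1', 'UTF-8');
echo $asLatin1, PHP_EOL;   // ???

// Symptom 2: UTF-8 bytes are wrongly assumed to be Latin-1 and
// converted to UTF-8 again, so one accented letter becomes two characters.
$french  = 'café';
$misread = mb_convert_encoding($french, 'UTF-8', 'ISO-8859-1');
echo $misread, PHP_EOL;    // cafÃ©
```

If you see question marks, some link in the chain probably converted to a charset that cannot hold the text. If you see pairs of odd accented characters, some link probably mislabeled UTF8 bytes as LATIN1.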

To debug charset issues, you must be certain that everything in the delivery chain is set for UTF8. I can’t stress this enough.

For example, on one project, the MySQL database was properly set to UTF8, but we kept seeing LATIN1 creep in from somewhere. The site was driven by PHP, and we made sure PHP was set to UTF8, but there were still issues. It turned out that PDO/mysqli was still defaulting to LATIN1, which was revealed by looking at the results of the following query issued through PHP:

SHOW VARIABLES LIKE 'character%';

Which should result in:

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

But instead we saw:

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

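This clearly demonstrated that the connection itself was the issue: the client, connection, and results character sets were all latin1, even though the server and database were utf8. As a quick fix, we issued the following query on that connection: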

SET NAMES 'utf8';

That fixed the problem, though it required us to run that query on every new connection. I am sure there is a better approach, but we didn’t have time to find it.
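
One commonly used alternative, not the approach we took on that project, is to name the charset when the connection is created instead of running a query afterwards. The sketch below assumes PDO with the MySQL driver and PHP 5.3.6 or later (earlier versions ignore the charset part of the DSN); the helper names build_mysql_dsn and connect_utf8, and the host, database, and credential values you pass in, are hypothetical.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that names the connection charset.
function build_mysql_dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: opens a connection whose client, connection, and
// results character sets are taken from the DSN, so no SET NAMES is needed.
function connect_utf8(string $host, string $dbname, string $user, string $password): PDO
{
    return new PDO(build_mysql_dsn($host, $dbname), $user, $password, [
        // Throw exceptions on errors instead of failing silently.
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo build_mysql_dsn('localhost', 'blog'), PHP_EOL;
// mysql:host=localhost;dbname=blog;charset=utf8
```

With mysqli, the equivalent is to call set_charset('utf8') on the connection object right after connecting. Also note that MySQL’s utf8 stores at most three bytes per character; on MySQL 5.5.3 and later, utf8mb4 covers the full range, so you may want to pass that as the charset instead.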

But to give you an idea, this is the chain we had to check for UTF8:

  • MySQL Server
  • MySQL Driver/PDO Wrapper
  • PHP
  • Browser

If you are interacting with MySQL through the command-line client, then make sure you launch it thusly:

mysql --default-character-set=utf8

Or have the appropriate settings in the [client] section of my.cnf.
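
As a rough sketch, the relevant fragment of my.cnf could look like the following. The file’s location varies by system, and your existing file may already contain other settings in this section.

```ini
# Applies to the mysql command-line client and other programs
# that read the [client] group.
[client]
default-character-set = utf8
```

After this, running SHOW VARIABLES LIKE 'character%'; from the command-line client is a quick way to confirm that the setting took effect.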

The character set headaches are not limited to MySQL; they affect any interacting systems, web services, etc. Carefully checking the chain to ensure that every part of it defaults to UTF8 is essential to saving the day for the world of localized globalization!