Please, hard-code your literals

As Feuerstein says,


We all know that hard-coding is a bad thing in software. But most developers think of hard-coding simply as typing a literal value into your program whenever you need it.

So what’s wrong with doing this? Nothing – as long as the value is never going to change. But what’s the chance of that happening? In fact, what’s the chance of anything staying the same (never changing) in our application requirements and resulting code?

Almost nil.

I agree with most of what Steven says in this post, except for the “Almost nil” part. In my experience, developers who have slavishly converted all literal values in their applications to constants have made life much more difficult for subsequent maintenance and performance tuning.

I would contend that literal values should be hard-coded throughout the code when those values will not change, because they cannot change. The main example of this kind of value is that of hidden magical ID numbers, that crop up especially often in database designs featuring any kind of EAV pattern. These designs usually feature some kind of metadata table, e.g.:

PROPERTYTYPES (
  propertyid,
  code,
  description,
  datatype, length, etc.)

and sprinkled throughout the codebase, like sand in your toddler’s nappy after a trip to the beach, are statements like this:

SELECT value INTO custname
FROM propertyvalues
WHERE entityid = p1
AND propertyid = cNAME;

(where cNAME is a constant that happens to be set to 30456 or something like that)

It gets worse when they need more than one property about an entity:

SELECT a.value, b.value, c.value
INTO custname, custphone, custaddress
FROM propertyvalues a, propertyvalues b, propertyvalues c
WHERE a.entityid = p1 AND b.entityid = p1 and c.entityid = p1
AND a.propertyid = cNAME
AND b.propertyid = cPHONE
AND c.propertyid = cADDRESS;

Someone might say, “that’s good, isn’t it? The hardcoded literal values have been stored once and once only in the constant declarations, and they can be used everywhere. If we want to change a property ID we can change it in one place and everything still works.”

Wrong – for two reasons.

Firstly, it’s not just one place – if you change the property ID, you have to also change the data in all the tables that point to that property. You also have to change code in that frontend UI that can’t read those constants, or in that external process that inexplicably was written with its own logic around those particular ID values (and search-and-replace won’t work because they’ve written some braindead code like this: if propertyid > 30453 & propertyid < 30458 { ... }). The larger, more complex and widespread the codebase, the more it’s just not going to change.

Secondly, WHY? Why would you ever want to change these IDs? Much preferable to hard-code those ID values everywhere. If you stick to using constants where it makes sense (like the “maximum salary” that Steven had in his excellent example), then your hard-coded literals will tell future developers one important and life-preserving message:

“Do not change this code.”

Why EAV is bad, mostly; and why it is used so often
Lesson learned today

Comments

  1. I once worked at a place that had literals c_Y and c_N for 'Y' and 'N' respectively.Not only is it silly, but it also played merry havoc with query plans as the same plan got used for both.

  2. Jeff,Thanks. This is, in fact, one important point where the "best practices" of procedureal language programming (like JAVA, DOT NET, C++ etc.) should not be applied to SQL.BTW, I would like to add something to your following statementI would contend that literal values should be hard-coded throughout the code when those values will not change and do not set CURSOR_SHARING to FORCE

  3. I think one aspect why to use constants instead of hard coded values was not mentioned in the article. It produces self documenting code! It's a big difference if you have to readWHERE entityid = p1AND propertyid = 30456;instead ofWHERE entityid = p1AND propertyid = cNAME;Another aspect why to use constants is compiler checking. If you use a constant, the compiler will check that it really exists. If you use a hard coded value like a number or a string you could have a typo and you will never notice if you are not doing proper testing…Just my 2 centRegardsPatrick

  4. @Gary: thankfully I've personally only rarely encountered that extreme 🙂 – but yes, I didn't mention how often I've had problems with plans being reused for queries on quite different data.@Narendra: yes, and I'd extend it to PL/SQL too. Your point about CURSOR_SHARING is appropriate, if we're being forced to uses FORCE then it doesn't make much difference (performance-wise) whether we use constants or literals.@Patrick: yes, you bring up one of the biggest advantages to using constants everywhere, with which I largely agree. In my case, I compromise by religiously adding comments everywhere I use a magic literal, e.g.WHERE entityid = p1AND propertyid = 30456/*NAME*/;After a while, the number+comment gets ingrained somewhere in the brain and if one sees something like this…WHERE entityid = p1AND propertyid = 30457/*NAME*/;…it stands out ("30457? That's not NAME… Fetch me the data dictionary!"). Plus, on a side note, it makes it MUCH easier to take the SQL, containing 20+ bind variables, out of the cache and diagnose or tune it.Unfortunately the place I'm at currently have taken the "constants at all costs" approach entirely wrong, and we end up with code like this:WHERE entityid = p1AND propertytypecode = kNSCHRD;… and what, you might ask, is kNSCHRD?kNSCHRD CONSTANT VARCHAR2(10) := 'NSCHRD';!!!So mainly, my beef is with people that have taken the "no literals" rule literally and have used constants everywhere just to tick that box, without actually making the code any more maintainable.Footnote: I believe in a well-designed data model that doesn't use the EAV pattern, this sort of thing doesn't appear as often – there are less "magic values" around in the first place.

  5. Connor McDonald
    3 March 2010 - 5:12 am

    You can model all of this out (if you are that way inclined). For example, assuming that 'NSCHRD' has some "meaning" for the purpose of demo I'll call: "is_special"create table PROPERTY_TYPE_CODES( id varchar2(n), is_special varchar2(1));insert into PTC values ('NSCHRD','Y');insert into PTC values ('something else','N');then you query based on the indicators, egselect *from table t, PTCwhere t.id = ptc.idand ptc.is_special = 'Y'thus limiting your literals to 'Y' and 'N', and changing ID values becomes a meta-data exercise rather than coding exercise…(I'm just saying how it can done – I'm NOT an advocate!)

  6. Hi JeffI think the blanket rule here is stretched a bit in all use cases. I can see a couple reasons where it will fail in the case of the hardcoded property ID example 30453. For example say it was written into a query’s where clause:a) There are sites where their dev and test databases are complete copies of the production db including data. As such property ID 30453 is the same in all 3 environments and the hardcoded propertyId *will* run. Yet other sites exist where dev and test are limited copies of the production db only in terms of the meta data, but not the data (in other words, the tables are the same, but the data ain’t necessarily so, be it best practice or not). We can instantly see the issue if property ID 30453 is in production, but it is 7 in development and -45 in test. The hardcoded query is going to be incorrect in these cases.b) Another example is where we are running an update in several separate customers’ production databases for a shrink wrapped solution. Say for example the update scripts are run in one customer’s database table whose property table isn't yet populated and the associated sequence number is still at 1. In another customer’s database who has been lovely, bought all our software, applied all out patches, and the property table is heavily populated and the latest number for the associated sequence is over 30453. After running our import scripts, then creating new objects including a view with the hardcoded propertyId = 30543, we’ll hit problems in both cases. For the first customer presumably we’d insert the property record with ID 30543, but then have to increment the sequence number to jump to the max number. With the later customer, it’s more dire, as the customer has already consumed propertyId 30543, we’ll have to be satisfied with a later Id generated from the sequence, and our hardcoded id will have to change.Of course these issues don’t apply to all environments, just some.As pointed out by Connor et all, there are db modelling solutions, the obvious subquery option, or, the more obvious PLSQL constant solution.Regards,CM.

  7. Steven Feuerstein
    4 March 2010 - 1:21 pm

    Many fascinating comments! You are saying that in the relatively uncommon scenario when an ID really won't ever possibly change no way that it is better to not hide the value. Jeff, you also write 'So mainly, my beef is with people that have taken the "no literals" rule literally and have used constants everywhere just to tick that box, without actually making the code any more maintainable.'It's hard to argue with people who argue against mindlessly following any rule. In fact, we should never follow any rule mindlessly. Now there's a rule to follow without exception! 🙂 It seems to me that the biggest concern is impact on performance of using these constants (or calling a function to return the value, which offers more flexibility but also complicates the optimization efforts). We certainly cannot sacrifice performance to an unacceptable degree to improve maintainability. I say "unacceptable" because we should not be concerned in the abstract about performance; surely that named constant or function reference in a query will not always mean severe degradation of performance.As for duplication of values on the front end – well, I would extend the same principle there: the front-end would call the backend function to retrieve the magic value, stick it in its own global or whatever mechanism makes sense "up there", and then reference the global. The magic value is not, then, duplicated on the front end.So my take away is: yes, there are unusual circumstances that might warrant repeated hard-codings, but they are rare and the main thing is to encourage developers to avoid the repeated hard codings.Regards,Stevenwww.stevenfeuerstein.comwww.plsqlchallenge.com

  8. I was going to summarise what we've been saying here but Steven, you've done it nicely for me :)My intention was to question the idea (that some people seem to have) that ALL literals MUST be made into constants; which in my personal experience I've seen particularly misapplied in two different systems. Perhaps two is not a significant sample – maybe I tend to find myself on sites with poorly designed systems!Anyway, when I'm designing a new system, the tendency is for relatively few literals anyway – a manageable number for which constants are quite practical; and on the UI side, even fewer – wherein it is simple for the client to make a database call to initialise the constants, as Steven suggests.

  9. Connor McDonald
    24 March 2010 - 6:59 am

    Here's one from left field….Use PLSQL_CCFLAGS so that your source code has "globals" but your compiled code has "literals". 🙂

Leave a Reply

Your email address will not be published / Required fields are marked *