Tuesday, August 28, 2012

Informix 11.70.xC2: It's out!

This article is written in English and Portuguese
Este artigo está escrito em Inglês e Português

English Version:

It's been too long since my last post. The reason is the usual lack of time. Sorry for that...
This is just a short post to let you know that Informix version 11.70.xC2 is being made available. You can already get it from the Fix Central site (link on the right). The Information Center documentation site has not been updated yet, but a quick look at the release notes shows some interesting stuff:
  • Installation without root privileges
    This makes it possible to install and use Informix without root privileges. But due to the Unix/Linux nature, some features may not be available. This was a request from some embedded solutions providers. So it's another feature that helps make Informix the right choice for those environments. I believe we'll see more developments in the usage without root privileges in the future.
  • More SQL Admin API commands
    This time we'll see things like CREATE/DROP DATABASE and ONTAPE/ONBAR/ONSMSYNC commands. This is great since you can now trigger your backups as tasks, something we were missing.
  • More improvements in the BTS text search datablade. IBM continues to improve this datablade, which is free of charge
  • Table and column ALIASES in DML instructions (SELECT, DELETE and UPDATE)
    The ALIAS can now be used in GROUP BY clauses. I like this one. We could use numbers, but if you change your projection list you also need to change the GROUP BY clause.
  • Case insensitive searches
    This was also a frequently requested feature. It applies only to NCHAR and NVARCHAR columns and you need to specify it in the CREATE DATABASE statement
  • OAT improvements
    As usual, when we see engine improvements we also see them in OAT. This time, a new area lets you manage your backups. Other features include the ability to uninstall a plug-in (something I also missed), the ability to create reports based on historical data, improvements in the schema manager plug-in and a few more
  • Ability to configure the number of file descriptor servers
    This is an intriguing feature related to a nasty problem that affects Informix instances with very intensive usage (I'm talking about thousands of concurrent connections and a very high rate of new connections per second; typical values are above 2000 concurrent sessions and/or above 15 new connections per second, but it really depends on the environment). This issue is worth a dedicated article, and tech support usually knows it as the "nsf.lock issue". If you have never heard about it, it's because you don't suffer from it. In any case, this feature is in fact present in several older versions (later v10 fixpacks and 11.50). Unfortunately it was not properly documented. Also note that v11.7 has some structural changes that should eliminate this problem. The feature takes the form of a new parameter called NUMFDSERVERS. Classical versions (pre v10.??) used just one file descriptor server. Somewhere in the v10 family it was decided that more was better, but sometimes it isn't, due to other points of contention (eliminated in v11.7). So now you can decide how many to use.
  • Informix Warehouse Accelerator
    This is a new product that uses new technology. It is composed of a new in-memory query engine and a tool you use to map your OLTP data into that new system. Then, when you send DSS-like queries to the OLTP engine, it will decide whether the "partner" system can handle them. If it can, the query is routed transparently and the results are sent back. If it can't, then the OLTP engine will resolve the query itself. The advantage is that you get much (really!) faster query times on the queries routed to the new system. There was a preview of this technology at the IIUG conference last year and the results were impressive. Please stay alert, because there will be some buzz around this (the same technology is already available on DB2 for z/OS)
In short, that's it. And, as with any other fixpack, there are some bug fixes.
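To make the GROUP BY alias item above a bit more concrete, here is a minimal sketch. It runs against SQLite (through Python's standard sqlite3 module) purely as a stand-in engine that happens to accept an alias in GROUP BY; the table and column names are hypothetical, and the exact Informix behaviour should be checked in the 11.70.xC2 release notes.

```python
import sqlite3

# SQLite is only a stand-in here; the orders table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_num INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(101, 50.0), (101, 25.0), (102, 10.0)],
)

# The alias "cust" is declared in the projection list and reused in GROUP BY,
# so reordering the projection list does not require touching the GROUP BY.
query = """
    SELECT customer_num AS cust, SUM(amount) AS total
    FROM orders
    GROUP BY cust
    ORDER BY cust
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Compared with positional references, the alias keeps the grouping tied to a name rather than to a position in the projection list.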


Versão Portuguesa:


Passou muito tempo desde o último artigo. A razão é a habitual falta de tempo. As minhas desculpas...
Este artigo serve apenas para dar conta de que a versão 11.7.xC2 do Informix está a ser disponibilizada. Já é possível obtê-la do site Fix Central (ligação à direita). A documentação no Information Center ainda estará a ser actualizada. Mas uma pequena espreitadela nas release notes mostra algumas coisas interessantes:
  • Instalação sem privilégios de root.
    Isto torna possível instalar e utilizar o Informix sem privilégios de root. Mas dada a natureza dos sistemas operativos Unix/Linux, algumas funcionalidades poderão não estar disponíveis. Isto foi um pedido de alguns fornecedores de soluções embebidas. Por isso é mais uma funcionalidade que ajuda o Informix a ser a escolha acertada para este tipo de ambientes. Acredito que iremos ver mais desenvolvimentos relativos à utilização sem root no futuro
  • Mais comandos da API de administração SQL
    Desta vez vemos comandos como CREATE/DROP DATABASE e ONTAPE/ONBAR/ONSMSYNC. Isto é óptimo pois passamos a poder despoletar backups como tarefas. Algo que já se sentia falta
  • Mais melhorias no datablade de pesquisa de texto livre (BTS)
    A IBM continua a melhorar este datablade que é distribuído sem custos com o produto
  • ALIAS em tabelas e colunas nas instruções de DML (SELECT, DELETE e UPDATE)
    Os ALIAS podem agora ser usados nas cláusulas de GROUP BY. Pessoalmente agrada-me bastante. Já podíamos usar números, mas se mudássemos a projection list teríamos também de arranjar a cláusula GROUP BY
  • Pesquisas por caracteres não sensíveis a maiúsculas ou minúsculas
    Esta funcionalidade fazia parte da lista com mais pedidos. Só se aplica a colunas NCHAR e NVARCHAR e tem de ser especificada na criação da base de dados (instrução CREATE DATABASE)
  • Melhorias no OAT
    Como vem sendo hábito, sempre que temos melhorias no motor também as vemos no OAT. Desta vez, uma nova área permite gerir os backups. Outras novidades incluem a possibilidade de desinstalar plug-ins (algo de que sentia falta), criação de relatórios baseados em dados de histórico, melhorias no plug-in de gestão de schema e mais algumas
  • Possibilidade de configurar o número de servidores de file descriptors.
    Isto é uma funcionalidade algo intrigante, relacionada com um problema complexo que afecta instâncias Informix com uma utilização muito intensiva (estou a falar de milhares de sessões concorrentes e uma taxa muito alta de novas sessões por segundo - valores tipicos situam-se perto de mais de 2000 sessões concorrentes e/ou mais de 15 novas sessões por segundo, mas dependerá sempre de cada ambiente)
    Este assunto mereceria por si só um artigo, mas o suporte técnico reconhece-o por "problema do nsf.lock". Se nunca ouviu falar nele é porque nunca sofreu com ele. Em qualquer caso, esta funcionalidade está de facto presente em várias versões já antigas (últimos fixpacks da versão 10 e fixpacks da versão 11.50). Infelizmente não estava devidamente documentada. Note-se também que a versão 11.7 tem algumas modificações estruturais que deverão eliminar este problema. A funcionalidade traduz-se num novo parâmetro chamado NUMFDSERVERS. Versões antigas (pre v10.??) usavam apenas um servidor de file descriptors. Num determinado fixpack da versão 10 considerou-se que mais era melhor, mas em alguns casos não é, devido a outros pontos de contenção (eliminados na versão 11.7). Assim, agora podemos decidir e ajustar quantos queremos
  • Informix Warehouse Accelerator
    Isto é um novo produto que utiliza tecnologia nova. É composto por um motor de queries, baseado em memória, e uma ferramenta que pode usar-se para mapear alguns dados do sistema OLTP neste novo sistema. Depois, quando enviamos uma query do tipo DSS ao motor OLTP, ele decide se o novo sistema associado pode resolver a query. Se sim, a query é enviada transparentemente ao novo sistema, e os resultados são enviados de volta. Se a query não puder ser processada pelo sistema "emparceirado", então o sistema OLTP irá resolvê-la. A vantagem é que obteremos muito (mesmo muito!) melhores tempos de execução nas queries enviadas ao novo sistema. Houve uma antevisão do sistema na última conferência do IIUG e os resultados eram realmente impressionantes. Mantenha-se alerta, pois isto irá certamente dar que falar nos próximos tempos (a mesma tecnologia já existe para DB2 em z/OS)
Em resumo é isto. E como em qualquer outro fixpack contém igualmente um número de correcções.

Monday, August 27, 2012

Panther: Extending extents / Estendendo os extents

This article is written in English and Portuguese
Este artigo está escrito em Inglês e Português

English version:

Back to Panther... Although I'm not in the video on the right, I do love Informix. That doesn't mean I ignore some issues it has (or had, in this particular case). One thing that always worries an Informix DBA is the number of extents in his tables. Why? Because that number used to have a maximum, and because that maximum was pretty low (compared with other databases). But what is an extent? In short, an extent is a sequence of contiguous pages (or blocks) that belong to a specific table or table partition. A partition (a table has one or more partitions) has one or more extents.
Before I go on, and to give you some comparison I'd like to tell you about some feedback I had over the years from a DBA (mostly Oracle, but who also managed an Informix instance):
  • A few years ago he spent lots of time running processes to reduce the number of extents in a competitor database that had reached an incredible number of extents (in the hundreds of thousands). This was caused by really large tables and really badly defined storage allocation
  • Recently I informed the same customer that Informix was eliminating the extent limits and the same guy told me that he was afraid that it could lead to situations as above. He added, and I quote, "I've always admired the way Informix deals with extents"
So, if a customer DBA tells me that he admires the way we used to handle extents, and he's afraid of this latest change, why am I complaining about the past? As in many other situations, things aren't exactly black and white... Let's see what Informix has always done well:

  1. After every 16 new extent allocations, Informix automatically doubles the size of the next extent for the table. This decreases the number of times it will try to allocate a new extent. So, using only this rule (which is not correct, as we shall see), if you create a table with a next size of 16K, you would reach 4GB with around 225 extents.
  2. If Informix can allocate a new extent contiguous to an already existing one (from the same table, of course) then it will not create a new one, but will instead extend the one that already exists (so it does not increase the number of extents). This is one reason why rule number one may not be seen in practice. In other words, it's more than likely that you can reach 4GB with fewer than 225 extents.
  3. In version 11.50, if I'm correct, a fix was implemented to prevent excessive extent doubling (rule 1). If the next extent size is X and the dbspace only has a maximum of Y contiguous free space (Y < X), Informix will allocate Y and will not raise any error.
    If this happens many times, we could end up with a relatively small number of allocated pages, but a next extent size that is too big. There's a real problem in this: if in these circumstances we create another chunk in the same dbspace, and after that our table requires another extent, the engine could reserve a large part of the new (and possibly still empty) chunk for our table. This can be much more than the size already allocated to the table. To avoid this, extent doubling will only happen when there is a reasonable relation between the newly calculated next extent size and the space effectively allocated to the table.
  4. Extent descriptions in Informix have never been stored in the database catalog. This leads to simpler and more efficient space management. Other databases that used to do this management in the catalog tended to hit issues in the SQL layer and, at the same time, were slower. One of our competitors changed that in their later versions, and DBAs have seen improvements with that (but they had to convert). Informix always worked the better way...
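The "around 225 extents" figure in rule 1 can be checked with a few lines of arithmetic. The sketch below applies only the doubling rule, assuming every extent (including the first) starts at the 16K next size and that no extent concatenation takes place; the function name and parameters are mine, not anything from the engine.

```python
def extents_to_reach(target_kb, next_size_kb=16, doubling_interval=16):
    """Count extents needed to reach target_kb using only the doubling rule.

    Assumes no contiguous-extent merging (rule 2) and no doubling guard
    (rule 3); both are ignored here on purpose.
    """
    allocated_kb = 0
    extents = 0
    size_kb = next_size_kb
    while allocated_kb < target_kb:
        allocated_kb += size_kb
        extents += 1
        # After every 16 allocations the next extent size doubles.
        if extents % doubling_interval == 0:
            size_kb *= 2
    return extents


four_gb_in_kb = 4 * 1024 * 1024
print(extents_to_reach(four_gb_in_kb))  # 225
```

After 224 extents the table holds 256K short of 4GB, so the 225th extent takes it over the line, which matches the number quoted above.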
So, these are the good points. Again, why was I complaining? Simply because, although Informix has done a pretty good job of preventing the number of extents from growing too much, we had a very low limit on the number of extents. On platforms with a system page size of 2K this was around 220-240 extents (max), and on 4K platforms it was about double that (440-480). With version 10 we became able to use larger page sizes, and that increases the maximum number of extents per partition.
Why didn't we have a fixed limit, and why was it different across platforms? In order to explain that we must dive a bit deeper into the structure of dbspaces and tables. Each partition in Informix has a partition header. Each partition header is stored in exactly one Informix page. There is a special partition in every dbspace (the tablespace tablespace) that holds all the partition headers from that dbspace. This special partition starts at a specific page, but it can contain more than one extent.
Also important to understand this is the notion of slot. Most Informix pages contain several slots. For a data page, a slot contains a row (in the simplest cases). A partition header is a page that contains 5 slots:
  1. Slot 1
    This contains the partition structure: things like creation date, partition flags, maximum row size, number of special columns (VARCHAR and BLOB), number of keys (if it's an index or has index pages), number of extents and a lot of other stuff. If you want to see all the details, check the sysptnhdr table in $INFORMIXDIR/etc/sysmaster.sql. It's basically an SQL interface to the partition headers in the instance.
    In version 11.50 this slot should occupy 100 bytes. In previous versions it can be smaller (no SERIAL8 and BIGSERIAL)
  2. Slot 2
    Contains the database name, the partition owner, the table name and the NLS collation sequence
  3. Slot 3
    Contains details about the special columns. If there are no special columns this slot will be empty
  4. Slot 4
    Contains the description for each key (if it's an index or a mix). Starting with version 9.40, by default the indexes are stored in their own partitions. This was not the case in previous versions. A single partition could contain index pages interleaved with data pages.
    Currently, by default, a partition used for data should not have any key, so this slot will be empty
  5. Slot 5
    Finally, this is the slot that contains the list of extents.
Now we know the list of extents must be stored in the partition header. The partition header has 5 slots, and the size of the first four may vary. This means that the free space for slot 5 (the extent list) is variable. These are the reasons why we had a limit and why that limit was not fixed. It would vary with the table structure, for example. And naturally it would vary with the page size.
A table that reached its maximum number of extents was a very real and serious problem in Informix. If you reached the limit and all your table's data pages were full, the engine would need to allocate one more extent in order to complete new INSERTs. But for that it would require some free space in the partition header. If there was no space left, any INSERT would fail with error -136:
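The dependency of the limit on page size and on the first four slots can be illustrated with a rough calculation. Every byte count below other than the page size and the 100-byte slot 1 is an assumption made for illustration (page overhead, slot table entry size, size of one extent entry, size of slot 2); this is not the documented on-disk layout.

```python
def max_extents(page_size, slot_sizes,
                extent_entry_size=8,      # assumed bytes per extent entry
                page_overhead=28,         # assumed page header + trailer bytes
                slot_table_entry_size=4,  # assumed bytes per slot table entry
                slot_count=5):
    """Estimate how many extent entries fit in slot 5 of a partition header.

    slot_sizes holds the sizes of slots 1 to 4; whatever is left on the
    page after them (and the assumed overheads) is available to slot 5.
    """
    used = page_overhead + slot_count * slot_table_entry_size + sum(slot_sizes)
    free = page_size - used
    return max(free // extent_entry_size, 0)


# Slot 1 = 100 bytes (as stated for 11.50), slot 2 = 68 bytes (hypothetical),
# slots 3 and 4 empty (no special columns, no keys).
print(max_extents(2048, [100, 68, 0, 0]))   # 229
print(max_extents(4096, [100, 68, 0, 0]))   # 485
# Adding special columns (a hypothetical 40-byte slot 3) lowers the limit.
print(max_extents(2048, [100, 68, 40, 0]))  # 224
```

Under these assumptions the results land in the same ballpark as the ranges quoted earlier, and they show why two tables on the same platform could hit the limit at different extent counts.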

-136 ISAM error: no more extents.

After hitting this nasty situation there were several ways to solve it, but all of them required making the table temporarily unavailable, which is something we can rarely afford these days... We tend to run 24x7 systems. Even systems with a maintenance window would suffer from this, because most of the time the problem showed up during "regular" hours...

So, I've been talking in the past tense... This used to be a problem. Why isn't it a problem anymore? Because Panther (v11.7) introduced two great features:
  1. The first one is that it is able to automatically extend the partition header when slot 5 (the extent list) becomes full. When this happens it will allocate a new page for the partition header that will be used for the extent list. So you should not see error -136 caused by reaching the extent limit. At first you may think like my customer DBA: "wow! Isn't that dangerous? Will I get tables/partitions with tens of thousands of extents?". The answer is simple. You won't reach such a high number of extents because all the nice features that were always there (automatic extent concatenation, extent doubling...) are still there. This will just avoid the critical situation where the use of the table would become impossible (for new INSERTs). And naturally it doesn't mean that you should stop caring about the number of extents. For performance reasons it's better to keep it low.
  2. The second great feature is an online table defragmenter. The number of extents can now grow indefinitely, but that's not good. Once you notice you have a table with too many extents you can ask the engine to defragment it. I will not dig into this simply because someone else already did it. I recommend you check the recent DeveloperWorks article entitled "Understand the Informix Server V11.7 defragmenter"

Versão Portuguesa:

De volta à Panther... Apesar de não estar no vídeo à direita, eu adoro o Informix. Isso não significa que ignore alguns problemas que ele tem (ou tinha neste caso particular). Uma coisa que preocupa qualquer DBA Informix é o número de extents das suas tabelas. Porquê? Porque esse número tinha um máximo, e porque esse máximo era bastante baixo (se comparado com outras bases de dados). Mas o que é um extent? De forma simples, um extent é uma sequência contígua de páginas (ou blocos) que pertencem a uma tabela ou partição de tabela. Uma partição (uma tabela tem uma ou mais partições) tem um ou mais extents.
Antes de continuar, e para estabelecer uma comparação, gostaria de transmitir algum feedback que ao longo de anos tive de um DBA (essencialmente Oracle, mas também geria uma instância Informix):

  • Há alguns anos atrás passou bastante tempo a executar processos para reduzir o número de extents de uma base de dados concorrente do Informix. Essa base de dados tinha tabelas que atingiram um número incrível de extents (na casa das centenas de milhar). Isto foi causado por tabelas verdadeiramente grandes e cláusulas de alocação de espaço realmente mal definidas
  • Recentemente informei esse mesmo cliente que o Informix ia eliminar o limite de extents, e a mesma pessoa disse-me que tinha receio que isso pudesse levar a situações como a de cima. Ele acrescentou, e cito: "Se há coisa que sempre admirei foi a maneira como o Informix gere os extents".
Assim sendo, se um DBA de um cliente diz que admira a maneira como geríamos os extents e ele próprio receia a eliminação de limites, porque é que eu me queixo do passado? Como em muitas outras situações, as coisas não são bem a preto e branco... Vejamos o que o Informix sempre fez bem:

  1. Após cada 16 novos extents adicionados, o Informix automaticamente duplica o tamanho do próximo extent da tabela. Isto diminui o número de vezes que tenta reservar um novo extent. Usando apenas esta regra (o que não é correcto como veremos a seguir), se criar uma tabela com o extent mínimo (16KB), a tabela pode crescer até aos 4GB com cerca de 225 extents.
  2. Se o Informix conseguir reservar um novo extent que seja contíguo a um que já esteja alocado à mesma tabela, então em vez de criar um novo, vai alargar o já existente (portanto não aumenta o número de extents). Esta é a razão porque a regra anterior pode não ser verificada na prática. Por outras palavras é mais que provável que consiga atingir os 4GB com menos de 225 extents.
  3. Salvo algum erro, na versão 11.50 foi introduzida uma melhoria para prevenir a duplicação excessiva do tamanho do próximo extent (regra 1). Se o tamanho do próximo extent for X, mas o dbspace só tiver um máximo de Y espaço livre contíguo (Y < X) o Informix vai criá-lo com o tamanho Y e nem se queixará de nada. Se isto acontecer muitas vezes, podemos acabar por ter um número de páginas efectivas de uma tabela ou partição relativamente pequeno e um tamanho para o próximo extent muito grande. Existe um problema real nisto: Se nessas cirunstâncias for criado um novo chunk nesse dbspace, e a tabela precisar de novo extent, pode acontecer que o motor reserve grande parte, ou mesmo a totalidade do novo chunk para a tabela (possivelmente muito mais que o tamanho já reservado até então). Para evitar isto, a duplicação do tamanho do próximo extent só acontece quando o novo tamanho tem uma relação razoável com o espaço reservado até então. Caso contrário o tamanho do próximo extent a alocar não é duplicado.
  4. A informação dos extents em Informix nunca foi guardada nas tabelas de catálogo. Isto faz com que a sua gestão seja mais simples e eficiente. Comparada com outras bases de dados que faziam a gestão no catálogo, estas tendiam a encontrar problemas e constrangimentos próprios da camada de SQL, e ao mesmo tempo eram mais lentas. Um dos concorrentes mudou isto há umas versões atrás e os seus utilizadores viram benefícios bem notórios (mas tiveram de converter). O Informix sempre trabalhou da melhor forma...
Estes são os pontos positivos. Mais uma vez, porque é que me estava a queixar? Simplesmente porque apesar de o Informix sempre ter feito um trabalho extraordinário na prevenção contra um elevado número de extents, nós tinhamos um limite, e era muito baixo. Em plataformas com um tamanho de página de sistema de 2KB este limite rondava os 220-240 extents, e em plataformas de 4KB o limite era cerca do dobro (440-480). Com a versão 10 pudemos passar a ter páginas maiores, e isso aumenta o limite.
Porque é que o limite não é fixo, e porque é diferente consoante a plataforma? Para explicar isto temos de nos debruçar de forma mais detalhada na estrutura física de um dbspace e tabela. Cada partição em Informix tem um cabeçalho. Cada cabeçalho de partição é guardado numa página Informix. Existe uma partição especial em cada dbspace (designada habitualmente por tablespace tablespace) que guarda todos os cabeçalhos das partições criadas nesse dbspace. Esta partição especial começa numa página específica do primeiro chunk do dbspace, mas pode ter mais que um extent.
Igualmente importante para compreender isto é a noção de slot. A maioria das páginas Informix estão divididas em slots. Para uma página de dados um slot contém uma linha de dados da tabela (caso mais simples). Um cabeçalho de partição é uma página que contém cinco slots:

  1. Slot 1
    Este contém a estrutura da partição (coisas como a data de criação, flags, tamanho máximo de uma linha, numéro de colunas ditas especiais - VARCHAR e BLOBs -, número de chaves - se for um indíce ou tiver páginas de indíce -, número de extents e uma série de outros detalhes. Se tiver curiosidade em saber o que lá está guardado consulte a tabela sysptnhdr na base de dados sysmaster (ou o ficheiro $INFORMIXDIR/etc/sysmaster.sql). Basicamente esta tabela é um interface SQL sobre todos os cabeçalhos de partição da instância Informix.
    Na versão 11.50 este slot ocupa 100 bytes. Versões anteriores podem ocupar menos (ausência do serial8 e bigserial)
  2. Slot 2
    Contém o nome da base de dados, dono da partição, nome da tabela e a NLS collation sequence
  3. Slot 3
    Contém detalhes sobre todas as colunas especiais. Se não existirem colunas especiais (VARCHAR e BLOB) este slot estará vazio. Se existirem, o tamanho ocupado dependerá da estrutura da tabela.
  4. Slot 4
    Contém a descrição de cada chave (se for um índice ou um mix). Desde a versão 9.40, por omissão os índices são guardados em partição à parte. Isto não era assim em versões anteriores. Uma partição podia conter páginas de índices e de dados.
    Actualmente, por omissão, uma partição usada para dados não deve ter nenhuma chave, e assim este slot deve estar vazio
  5. Slot 5
    Finalmente, este é o slot que contém a lista dos extents.

Agora sabemos que a lista de extents tem de ser guardada no cabeçalho da partição. E este contém cinco slots sendo que o tamanho dos primeiros quatro varia. Isto implica que o espaço livre para o slot cinco (a lista de extents) é variável. Estas são as razões porque tínhamos um limite e porque esse limite era variável. Variava por exemplo com a estrutura da tabela. E naturalmente variava com o tamanho da página.
Uma tabela que atingisse o número máximo de extents tornava-se num problema sério em Informix. Quando tal acontece, se todas as páginas de dados estiverem cheias, o motor terá de reservar um novo extent para completar novos INSERTs. Mas para isso necessitaria de espaço livre no cabeçalho da partição. Portanto, não havendo aí espaço livre todos os INSERTs falhariam com o erro -136:

-136 ISAM error: no more extents.

Depois de batermos nesta situação havia várias formas de a resolver, mas todas elas necessitavam de indisponibilidade temporária da tabela, o que nos dias que correm é um bem raro... A tendência é usarmos sistemas 24x7. Mesmo sistemas que tenham janela de manutenção sofreriam com isto, porque na maioria das vezes o problema manifestava-se durante o horário normal ou produtivo...

Bom, mas tenho estado a falar no passado.... Isto costumava ser um problema. Porque é que já não o é? Porque a versão 11.7 (Panther) introduziu duas excelentes funcionalidades:

  1. A primeira é que a partir de agora é possível estender automaticamente o cabeçalho da partição quando o slot cinco enche. Nesta situação, uma nova página é reservada para o cabeçalho da partição e a lista de extents pode crescer. Portanto não deverá voltar a ver o erro -136 causado por atingir o limite de extents. À primeira vista pode ter a mesma reacção que o DBA do meu cliente. "Epa! Mas isso não é perigoso? Vou passar a ter tabelas/partições com dezenas de milhares de extents?". A resposta é simples. Não vai atingir esses números de extents porque todas as boas características que sempre existiram (junção automática de extents, duplicação de tamanho do próximo extent...) ainda estão presentes e funcionais. Isto apenas evitará a situação crítica em que o uso da tabela se tornava impossível (para novos INSERTs). E naturalmente isto não significa que passemos a ignorar o número de extents das nossas tabelas. Por questões de desempenho é melhor mantê-los baixos.
  2. A segunda grande funcionalidade é um desfragmentador online de tabelas ou partições. O número de extents pode crescer indefinidamente, mas isso não é bom. Assim que notar que tem uma partição com um número elevado de extents pode pedir ao motor que a desfragmente. Não vou aprofundar este tema, simplemente porque já foi feito. Recomendo que consulte um artigo recente do DeveloperWorks intitulado "Understand the Informix Server V11.7 defragmenter". Infelizmente o artigo só tem versão em Inglês

Sunday, August 26, 2012

New Informix editions: On sale? [Check the new article and final note]

Hi. At the last IIUG conference I had the chance to talk to some Brazilian members of the Informix community and they gave me the idea that there is a high demand for Portuguese content. When I started the blog I decided to do it in English (and that was not an easy decision because I'm not that confident about my English) for two main reasons:
  • I wanted to reach a wider audience
  • I believe most Portuguese IT workers are used to reading English, so it would not be hard for them to follow this (ignoring my English mistakes of course)
The last reason probably contains two mistakes. The main one is that the Portuguese-speaking audience is much bigger than Portugal (for those who don't know, Portugal has around 10M people. Brazil is probably 20+ times this and then there are the African countries like Angola, Mozambique, Cabo Verde etc.). The other mistake may be that many people in Portugal do care about this and may not want to read stuff in English. All this intro serves to say that I'll probably be doing some articles in Portuguese. I really haven't decided yet how to do it (same article in two languages, or repeat the article, or create another blog...?). For now, and because I think this is a very important subject, I'll continue this article in Portuguese to talk about the new Informix Editions.

----- Originally written in Portuguese from here on -----

Introduction:

Well, for those who took the trouble to read the English introduction above, please ignore this paragraph. For everyone else, the introduction basically explains that although the language of the blog is English, I will probably start publishing some content in Portuguese. I chose English for the blog because I have the impression that in Portugal most people who work in IT don't mind reading English, and that way I reach a wider audience. At this year's user conference I had the chance to exchange views with Brazilian members of the Informix community, and they gave me the idea that this matters to a lot of people, and that many of them, for one reason or another, prefer or restrict themselves to content in Portuguese. So, and because the subject of the last article is really very important, I take it as the starting point for some Portuguese entries in the blog.

New Informix editions

As you may have already read or heard, on May 25, 2010 IBM reshaped its lineup of Informix editions.
So some of the editions we were used to are no longer available. Namely, the Enterprise Edition and the Workgroup Edition were discontinued. As replacements, the Ultimate Edition and the Growth Edition were introduced, respectively (ah... the names...). Very briefly, the Ultimate Edition includes everything Informix has to offer except the Storage Optimization Feature (compression), while the Growth Edition excludes partitioning, parallelism features and compression (it includes Enterprise Replication and clustering, with up to two secondary nodes in read/write mode) and is limited to 4 sockets or 16 cores and 16GB of RAM (the sum of all memory allocated to Informix in each installation).

But the big news is the introduction of two new editions: Innovator-C (available for all platforms) and Ultimate-C for Windows and MacOS. So what is different about these "-C" editions?
  • You can download them, develop with them and put them in production with no licensing costs
  • You can purchase support
  • They have limits on the resources they can use, but those limits are reasonable (there will certainly be opposing opinions)
Let's look at them in more detail. "-C" stands for "Community". Let's start with Innovator-C:
  • Free download at no cost
  • Can be used for development at no cost
  • Can be used in production
  • Available for all platforms supported by Informix
  • Limited to 2GB of RAM (the sum of all memory allocated to Informix in each installation), 1 socket or 4 cores, with no limit on disk space used
  • Two Enterprise Replication nodes
  • HDR (1 secondary node in read/write mode)
  • Features not available: compression, Continuous Availability Feature (CAF, shared disk secondaries), partitioning, parallelism features, Advanced Access Control (LBAC), Informix Warehouse and multiple secondary nodes, column encryption, distributed queries (I-Star) and other features (details in the license)
  • Optional support
E agora a Ultimate-C Edition para Windows e MacOS:
  • Free download, at no cost
  • Can be used for development at no cost
  • Can be used in production
  • Available only for Windows and MacOS
  • Limited to 16GB of RAM (the sum of all memory allocated to Informix on each installation), 4 sockets or 16 cores, with no limit on disk space used
  • Fully functional Enterprise Replication
  • HDR (1 read/write secondary node)
  • Partitioning
  • Parallelism
  • Read/write secondary nodes
  • Warehouse Feature (ETL)
  • Advanced Access Control (LBAC)
  • Informix Warehouse Tool (SQW)
  • Features not available: compression, Continuous Availability Feature (CAF, shared disk secondaries)
  • Optional support
NOTE: These editions cannot be redistributed without a prior agreement with IBM.

Other editions remain as they were: Developer Edition and Express Edition.
So now we have a free database with some limitations, but one that fits many scenarios. Does this mean IBM will lose sales? Not necessarily. Of course, support can be purchased. Anyone deploying critical solutions on these free editions will probably want support. On the other hand, this should increase Informix's market penetration and recognition. These editions could be the perfect match for the open source initiative. I could mention several situations that led companies to use MySQL or Postgres simply because of cost. Many of those scenarios could fit the possible uses of these editions. This makes the Open Source initiative even more relevant now. The Hibernate improvements are an excellent sign, and after a conversation with a local partner I think other Open Source projects should receive attention from the initiative. Fortunately, many of them are already listed on the initiative's website.

Another good improvement (I should call it a fix) was made to the usability of the Informix website. If you go to http://ibm.com/software/data/informix, or simply http://ibm.com/informix or even http://www.informix.com, you will be redirected to a page with a link to "downloads". From there you can easily browse a list of available downloads.

Where can you get more information on this topic?:
These changes take effect with the 11.50.xC7 fix pack.


NOTE [22 July 2010]: This article is outdated! The Ultimate-C editions for Windows and Mac were withdrawn. The Innovator-C edition will be available for all platforms, and a new edition (Choice) is introduced, with lower licensing costs than the Growth Edition and limits that sit between Innovator-C and the Growth Edition.
Article with the latest changes:

http://informix-technology.blogspot.com/2010/07/informix-editions-revisited-versoes.html


Regards

Sunday, August 19, 2012

Panther: oninit -i ... ups... too late... or not / Panther: oninit -i ... ups... tarde de mais... ou não

This article is written in English and Portuguese
Este artigo está escrito em Inglês e Português

English version:

I hope that this one will be quick... How many of us have tried to initialize (oninit -i) an already initialized instance by mistake? Personally I don't think I did it, but our mind tends to erase bad experiences :) But we have heard too many stories like this. A problem in the environment setup and this can easily happen.
Well, the good folks from R&D tried to keep us safe from ourselves by introducing a new parameter called FULL_DISK_INIT. It's something that magically appears in the $ONCONFIG file with the value of 0, or that simply is not there... Its absence, or the value 0, means that if you run oninit -i and there is already an Informix page in the rootdbs chunk, it will fail. Let's see an example:


panther@pacman.onlinedomus.net:fnunes-> onstat -V
IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
Wed Oct 20 00:02:10 2010

00:02:10 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:02:10 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:19 DR: DRAUTO is 0 (Off)
00:02:19 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:02:19 Event notification facility epoll enabled.
00:02:19 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:02:20 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:02:20 oninit: Fatal error in shared memory initialization

00:02:20 IBM Informix Dynamic Server Stopped.

00:02:20 mt_shm_remove: WARNING: may not have removed all/correct segments

Very nice. It didn't allow me to shoot myself in the foot.
And if we don't have it in the $ONCONFIG?:


panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
#FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
The default memory page size will be used.
00:06:43 Segment locked: addr=0x44000000, size=224858112

Wed Oct 20 00:06:44 2010

00:06:44 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:06:44 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:50 DR: DRAUTO is 0 (Off)
00:06:50 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:06:50 Event notification facility epoll enabled.
00:06:50 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:06:52 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:06:52 oninit: Fatal error in shared memory initialization

panther@pacman.onlinedomus.net:fnunes->

The same. So if I'm trying to configure a second instance and I point the ROOTPATH to an existing one, I'm safe... But this raises one question: how can I really re-initialize an instance? I know what I'm doing, so let me work!... It's simple... If you really know what you're doing, set it to 1:


panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 1
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y
panther@pacman.onlinedomus.net:fnunes-> onstat -

IBM Informix Dynamic Server Version 11.70.UC1 -- On-Line -- Up 00:00:32 -- 369588 Kbytes

panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes->

Perfect! It allowed me to initialize it, and immediately changed the FULL_DISK_INIT parameter to 0 to keep me safe again.
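If you want a script to tell you which way the check will go before you type oninit -i, a small helper can read the parameter from the configuration file. This is only a sketch: the function names are mine, and treating the last uncommented FULL_DISK_INIT line as the effective one is an assumption, not documented server behaviour.

```shell
#!/bin/sh
# Print the FULL_DISK_INIT value found in an ONCONFIG-style file.
# A missing or commented-out parameter is reported as 0, which matches
# the behaviour seen in the sessions above.
# Assumption: if the parameter appears more than once, the last
# uncommented line wins.
full_disk_init_value() {
    cfg="$1"
    val=$(awk '$1 == "FULL_DISK_INIT" { v = $2 } END { if (v != "") print v }' "$cfg")
    echo "${val:-0}"
}

# Hypothetical guard: succeeds only when the override is explicitly set to 1,
# i.e. when oninit -i would be allowed to overwrite an existing instance.
overwrite_allowed() {
    [ "$(full_disk_init_value "$1")" = "1" ]
}
```

A typical use would be something like `overwrite_allowed "$INFORMIXDIR/etc/$ONCONFIG" && echo "careful: oninit -i will not be blocked"`.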
This has been in the feature request list for years. Now that it's implemented we should be jumping up and down in plain happiness... But I'm not. Why? Because instead of sending the deserved compliments to R&D for implementing this I want more!
This is terribly useful, and will save a lot of people from destroying their instances. But unfortunately I've seen many other cases of destruction that can't be avoided by this. A few examples:
  1. A chunk allocation for a second instance on the same machine (using RAW devices) overwrites another already used chunk from another instance
  2. A restore of an instance overwrites another (either fully or partially)
  3. A restore of an instance on the same machine using the rename chunks functionality uses an outdated rename chunks file (-rename -f FILE ontape option). This file doesn't have a few chunks that were recently added. So these chunks will be restored over the existing chunks!
So, what would make me jump would be something that covered all these scenarios. It would not be a simple ONCONFIG parameter and a change in oninit. It would require changes in more utilities and server components (onspaces, SQL Admin API, ontape, onbar...), but that would really keep us safe from our mistakes. For now this is a good sign, and if these questions worry you, be alert and, if you have the chance, let IBM know that this is important to you.
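Until something like that exists, scenario 3 can at least be caught with a check before the restore. The sketch below assumes two things that you should verify for your version: that the rename file has one chunk per line with the old path as its first field, and that you have a plain text list of the instance's current chunk paths, one per line (how you produce that list is up to you). The function and file names are hypothetical.

```shell
#!/bin/sh
# Usage: missing_from_rename CHUNK_LIST RENAME_FILE
# Prints every chunk path from CHUNK_LIST that has no entry in RENAME_FILE.
# Any output means the rename file is probably outdated.
missing_from_rename() {
    awk -v rename="$2" '
        BEGIN {
            # Load the old paths (first field) from the rename file.
            while ((getline line < rename) > 0) {
                n = split(line, f, " ")
                if (n > 0) seen[f[1]] = 1
            }
        }
        # Report chunk paths that the rename file does not mention.
        NF > 0 && !($1 in seen) { print $1 }
    ' "$1"
}
```

If the function prints anything, those chunks would be restored to their original paths, so it is worth stopping and regenerating the rename file first.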

One instance was destroyed to bring this article to you... I'll spend another 30s to get the data back into it :)



Portuguese version:

I hope this one will be quick... How many of us have tried to initialize (oninit -i/iy) an already initialized instance by mistake? Personally I don't recall it ever happening to me, but our mind tends to erase traumatic episodes :) Still, we have heard too many stories like this. A problem in the environment setup is all it takes for this to happen.
Well, the good folks from development tried to keep us safe from ourselves by introducing a new parameter called FULL_DISK_INIT. It's something that magically appears in our $ONCONFIG with the value 0, or that simply is not there... Its absence, or the value 0, means that if we try to run oninit -i and there is already an Informix page in our rootdbs chunk, it will fail. Let's see an example:


panther@pacman.onlinedomus.net:fnunes-> onstat -V
IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
Wed Oct 20 00:02:10 2010

00:02:10 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:02:10 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:13 Booting Language from module <>
00:02:13 Loading Module
00:02:19 DR: DRAUTO is 0 (Off)
00:02:19 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:02:19 Event notification facility epoll enabled.
00:02:19 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:02:20 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:02:20 oninit: Fatal error in shared memory initialization

00:02:20 IBM Informix Dynamic Server Stopped.

00:02:20 mt_shm_remove: WARNING: may not have removed all/correct segments


Very nice. It didn't let me shoot myself in the foot.
And what if we don't have the parameter in the $ONCONFIG?:

panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -
shared memory not initialized for INFORMIXSERVER 'panther'
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
#FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y

WARNING: server initialization failed, or possibly timed out (if -w was used).
Check the message log, online.log, for errors.
panther@pacman.onlinedomus.net:fnunes-> onstat -m
shared memory not initialized for INFORMIXSERVER 'panther'

Message Log File: /usr/informix/logs/panther.log
The default memory page size will be used.
00:06:43 Segment locked: addr=0x44000000, size=224858112

Wed Oct 20 00:06:44 2010

00:06:44 Warning: ONCONFIG dump directory (DUMPDIR) '/usr/informix/dumps' has insecure permissions
00:06:44 Event alarms enabled. ALARMPROG = '/home/informix/etc/alarm.sh'
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:44 Booting Language from module <>
00:06:44 Loading Module
00:06:50 DR: DRAUTO is 0 (Off)
00:06:50 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
00:06:50 Event notification facility epoll enabled.
00:06:50 IBM Informix Dynamic Server Version 11.70.UC1 Software Serial Number AAA#B000000
00:06:52 DISK INITIALIZATION ABORTED: potential instance overwrite detected.
To disable this check, set FULL_DISK_INIT to 1 in your config file and retry.

00:06:52 oninit: Fatal error in shared memory initialization

panther@pacman.onlinedomus.net:fnunes->

The same thing happens. So if I'm trying to configure a new instance and by mistake I point ROOTPATH to an existing one, I'm safe... But this raises a question: how can I re-initialize an instance? I know what I'm doing, so let me work!... It's simple... If you really know what you're doing, you just have to set it to 1:


panther@pacman.onlinedomus.net:fnunes-> vi $INFORMIXDIR/etc/$ONCONFIG
panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 1
panther@pacman.onlinedomus.net:fnunes-> oninit -i

This action will initialize IBM Informix Dynamic Server;
any existing IBM Informix Dynamic Server databases will NOT be accessible -
Do you wish to continue (y/n)? y
panther@pacman.onlinedomus.net:fnunes-> onstat -

IBM Informix Dynamic Server Version 11.70.UC1 -- On-Line -- Up 00:00:32 -- 369588 Kbytes

panther@pacman.onlinedomus.net:fnunes-> onstat -c | grep FULL_DISK_INIT
FULL_DISK_INIT 0
panther@pacman.onlinedomus.net:fnunes->

Perfect. It let me initialize and immediately changed the FULL_DISK_INIT parameter back to 0 to keep me safe again.

This has been on the list of requested features for years. Now that it's implemented we should be jumping for joy... But I'm not. Why? Because instead of sending the deserved compliments to development, I want more!
This is tremendously useful, and will save a lot of people from destroying their instances. But unfortunately I have seen many other cases of destruction that this cannot prevent. A few examples:
  1. Creating a chunk for a second instance on the same machine (using RAW devices) overwrites another chunk already in use by another instance
  2. Restoring a backup overwrites another instance (fully or partially)
  3. Restoring a backup of an instance, on the same machine, using the chunk path rename functionality with an outdated rename file (the -rename -f FILE option of ontape). This file is missing some chunks that were added recently. So those chunks will be restored over the existing ones!
So, what would make me jump for joy would be something that covered all these scenarios. It would not be as simple as a change in the ONCONFIG and a change in oninit. It would require changes in more utilities and server components (onspaces, SQL Admin API, ontape, onbar...), but that would really keep us safe from our mistakes. For now, this feature is a good sign, and if these questions worry you, be alert and, if you have the chance, let IBM know that this is important to you.

One instance was destroyed to bring this article to you. Now I'll spend another 30s putting the data back :)

Saturday, August 11, 2012

Isolation level in WebSphere

This is probably the first post I'll write following a customer-facing situation. Although there may be good reasons not to write about some customer-facing situations, I feel this may bring some value to the blog, and we never know when someone having the same problem finds the blog on Google... So this may be the first of several small and direct posts.

Some time ago I had to visit a customer site that was having "major performance issues". After some examination and some talks with the development team I was able to identify several sessions running in Repeatable Read isolation level. The application has several components and one of them is an instance of WebSphere Application Server (WAS) v6.1.

Further investigation allowed us to understand that the application was not using EJBs, nor Session Beans (these allow the isolation level to be specified in the deployment descriptor XML file). As such the database connections were using the WAS default isolation level which is repeatable read.
For those less familiar with repeatable read, it's equivalent to the ANSI Serializable mode. Basically, any record read in order to find the result set is locked and remains locked in shared mode for the duration of the transaction. So an instance that was configured for 20000 locks could reach about one million of them (thanks to the automatic expansion of the lock table). Obviously this caused a lot of contention between sessions and a lot more internal work for the engine. This was causing their "performance" problems.

The solution was simply to redefine the WAS isolation level for the datasource used by the application. This can be done by using a custom property called webSphereDefaultIsolationLevel which as the name implies can be used to change the database connection default isolation level. Complete information about that can be found in the following documentation:

http://www-01.ibm.com/support/docview.wss?rs=180&uid=swg21224492

In there you can find the property description with the explanation of the values it accepts as well as other ways to change the default isolation level.
After changing this the application behaved properly and most of the performance issues went away. There were some other issues, like missing indexes, and some minor configuration changes on the database side, but those were clearly not the cause of the problems.
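As far as I know, the values accepted by webSphereDefaultIsolationLevel are the integers of the standard JDBC java.sql.Connection isolation constants, but do confirm this against the technote linked above. A tiny lookup helper (the function name and the level spellings are my own) makes it harder to type the wrong number into the datasource custom property:

```shell
#!/bin/sh
# Map an isolation level name to the matching java.sql.Connection constant.
# Assumption: webSphereDefaultIsolationLevel takes these same integers.
jdbc_isolation_value() {
    case "$1" in
        read-uncommitted) echo 1 ;;   # TRANSACTION_READ_UNCOMMITTED
        read-committed)   echo 2 ;;   # TRANSACTION_READ_COMMITTED
        repeatable-read)  echo 4 ;;   # TRANSACTION_REPEATABLE_READ
        serializable)     echo 8 ;;   # TRANSACTION_SERIALIZABLE
        *) echo "unknown isolation level: $1" >&2; return 1 ;;
    esac
}
```

For example, `jdbc_isolation_value read-committed` prints the number to use if you want connections to default to read committed instead of repeatable read.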

Sunday, August 5, 2012

Informix in virtualized environments

I recall that when I was around 16 to 19 years old I was completely amazed by the possibility of running a different operating system inside a window on my system. At the time I was using a Commodore Amiga, and I had software to emulate Atari, Apple and MS-DOS systems. The first two used the same CPU as my native system, and the latter was a complete emulation of an Intel x86 CPU. Because of this, performance was really awful, but nevertheless it was very interesting to use.

At that time we called that emulation, and the purpose was a bit different from what we currently call virtualization. The similarity lies in the fact that in both situations we create a virtual hardware environment in which we run an operating system and applications. Today, virtualization is a widespread technology, used in high-end systems as well as in plain simple laptops. Some examples of virtualization technologies and uses include:

  • IBM's system Z (mainframes)
    These systems have had virtualization technology for ages. We can run different operating systems on "partitions", which are groups of resources (CPU, memory, storage) allocated from the base machine. These operating systems include Linux, for example
  • IBM's system P (Power processors)
    It incorporates some of the System Z concepts. The partitions can be "physical" or "logical", and can have fixed or dynamic resource capacity. It can run AIX and Linux on the same base equipment in different "partitions"
  • SUN's Solaris Domains and containers
    On SUN's boxes you can create different partitions running different copies of your operating system, or create "containers", which are logical groups of resources that share the same copy of the operating system. IBM provides the Workload Manager for AIX for this.
  • HP-UX npars, vpars, Integrity VM and Secure Resource Partitions
    HP provides physical partitions, virtual partitions, virtual machines and also virtual resources environments sharing the same copy of the operating system
  • VMware
    Probably the most well known virtualization technology. It can run on our desktop systems (Windows and Linux) or be directly installed on the base hardware.
  • XEN
    An open source virtualization technology. It is used by several other environments like Amazon EC2 (more on this later)
  • SUN's VirtualBox
    It's another x86 virtualization product which runs on Windows, Linux, Mac OSX and OpenSolaris
For performance reasons, usually, the virtualization technologies just create virtual machines of the same architecture as the base system. This means the CPU type is generally the same. Emulating other kinds of CPUs, although technically possible, imposes a serious performance overhead. Also, current CPU technologies include support for virtualization directly on the chips. It's perfectly possible to do it without hardware support, but it's slower. The main issue is that any machine code instruction that tries to access the hardware directly has to be intercepted. If the virtualization system (hypervisor) didn't do it, you'd have conflicts between the different virtual machines running on the same host.

So... Why would we want to virtualize? Well, several reasons for several uses:
  • Many hardware resources are used below their capacity. Virtualization allows the same resources (CPU, memory, network and storage) to be shared by different (and isolated) machines. This leads to cost optimization
  • It's much easier to create a virtual machine on top of an existing hardware box, than to physically purchase, connect, install and manage a real machine
  • Due to the two reasons above, a virtual machine can be a great environment to support several activities like testing, learning and training, developing, demoing etc.
  • It's relatively easy to "shutdown" a virtual machine on one host, and "turn it on" on another hardware box. Latest versions of virtualization products sometimes even support "live" migration of virtual machines between different hardware boxes. This can become a real advantage in terms of system availability (without extra cost, like clustering, redundancy etc.)
  • It's possible to dynamically balance the physical resources (CPU capacity, disk and even memory) of the physical host between the virtual environments it supports. This means that different virtual machines with distinct usage cycles can co-exist on the same hardware box, and you can configure the resources to move between the virtualized hosts whenever their needs change
Ok. The above can give you an overview of the virtualization technologies and why you would want to use them. Now let's dig into the Informix related stuff. The first questions would be: Should you use Informix in virtualized environments? Does it work well? Does IBM support it? Does IBM provide flexible pricing to match the flexibility in these environments?

Well, the answer to all these questions could be a simple "yes". Let's see:

  • Informix architecture, usually referred to as Dynamic Scalable Architecture (DSA), is a perfect fit for virtualized environments. Informix implements the concept of a virtual CPU as an operating system process. These virtual CPUs then run user and system threads, which explains why it's so light. The virtual CPUs (CPU VPs in Informix jargon) can be added and removed dynamically. So, since the beginning of IDS (when DSA was introduced) you can effectively adjust the CPU resources of your instance dynamically. Memory can also grow and shrink, but I have to admit that it would be nice to see some improvements here: in practice it's very difficult to shrink the memory once it has grown.
    But the small footprint (both of installation and running resources) and dynamic resource adjustment are nice features for virtualized systems.
  • Regarding support, you can be confident that IDS is supported in these environments. There are obvious questions regarding performance issues, but you will not get the dreadful answer of "your setup is not supported" in case you need help from tech support.
  • Finally, IBM pricing is well aware of virtualization needs (assuming a CPU based license policy). You will only pay for the resources you assign to your virtual host. According to a recent announcement, your license fees will depend on your virtual host environment and not on the underlying hardware (which is usually much bigger, and as such would be more expensive).
    IBM calls this license scheme for virtualized environments "sub-capacity", alluding to the fact that you're running a virtual host with less capacity than the base hardware.
    If you want to license per concurrent session, then this is just like in any other (non-virtualized) environment
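The dynamic CPU VP adjustment mentioned in the first point is done with the onmode utility, whose -p option adds or drops virtual processors of a given class. The helper below is only a sketch with a name of my own: it validates the requested change and prints the command instead of running it, so you can review it before executing it as user informix on a running instance.

```shell
#!/bin/sh
# Usage: cpu_vp_command +N | -N
# Prints the onmode command that would add (+N) or drop (-N) CPU VPs.
# It does not run anything.
cpu_vp_command() {
    delta="$1"
    sign=$(printf '%s' "$delta" | cut -c1)
    count=$(printf '%s' "$delta" | cut -c2-)
    # The first character must be an explicit sign.
    case "$sign" in
        +|-) ;;
        *) echo "usage: cpu_vp_command +N|-N" >&2; return 1 ;;
    esac
    # The rest must be a non-empty string of digits.
    case "$count" in
        ''|*[!0-9]*) echo "usage: cpu_vp_command +N|-N" >&2; return 1 ;;
    esac
    echo "onmode -p ${sign}${count} cpu"
}
```

After the hypervisor gives the virtual host two more cores, for instance, `cpu_vp_command +2` prints the command that would let the instance use them.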
Virtual appliance with IDS developer edition

IBM announced some time ago the availability of a virtual appliance based on IDS Developer Edition. This is a pre-installed and pre-configured VMware image, running SUSE Linux Enterprise Server V11 and IDS 11.50. Everything is configured so you can easily deploy it and use it for testing, learning or development purposes. Scripts are provided to create a full MACH-11 cluster and instructions are included to lead you through some demos. You just need a free product from VMware to run it on your laptop. The appliance is available in 32 and 64 bit versions. You can access this virtual image in two ways:
When you first run the appliance, you'll go through some screens that allow you to make some configurations and also will prompt you for license acceptance. This process is fairly simple and will only take a couple of minutes. After that you'll see a normal Linux desktop with some shortcuts that will allow you to explore the power and simplicity of IDS.
This appliance is being constantly improved and updated by IBM. Current IDS version is 11.50.xC3, but you should expect 11.50.xC4 when available. I strongly recommend this appliance to anyone who wants to get familiar with Informix.


Amazon EC2 cloud


Cloud computing has become another buzzword of the IT industry. Large companies have large computing infrastructures. You can imagine that companies like IBM, SUN, Microsoft, Google, Yahoo, Amazon and so on have large datacenters spread around the world. Like any other computer in the world, these datacenters are not always using their full capacity. So, more and more companies are trying to take advantage of some of their computing power by making it available to customers as services. These resources are "somewhere" on the Internet. That's why the term "cloud" is used. Customers only have to know how to use these resources. They don't need to know how they're implemented or where they are located. You, as a customer, pay a certain fee to use a given amount of computing resources.
Amazon was one of the first companies to sell cloud computing services. It started around 2006, selling an infrastructure where customers could implement web services. Later it introduced the EC2 (Elastic Compute Cloud) concept. The idea here is to rent virtual machines (Linux or Windows) to anybody who needs them. And you pay only for what you use, at the rate of $0.1 / hour for what Amazon calls a "small instance". This is the "equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor" and has 1.7GB of RAM. Not a big server, but perfectly adequate for some tests or studying. You can also rent bigger instances, and you can rent several of them.
So, the term "elastic" means you can rent the resources you need and grow them as your needs grow. And you won't have to pay for physical allocation and equipment.

Now, why am I talking about this? Simply because IBM made the same virtual appliance I wrote about above, available as an AMI (Amazon Machine Image). This means you can rent an Amazon instance running IDS 11.50.UC3 (32 bit only for now) on top of SUSE Enterprise Linux.
To be honest, I was a bit lost with all these concepts, so I decided to test this myself. I've followed the following steps:
  1. I went to Guy Bowerman's blog to search for info
  2. I got hold of the IBM Informix Server Amazon Machine Image (AMI) Getting Started Guide
  3. I went to http://aws.amazon.com/ec2/ and signed up. After login you'll have the access keys and an X.509 certificate (private and public key). These are used to identify you when calling the Amazon web services (which implement the Amazon management API). So you should download them to your local system (as explained in the Getting Started Guide)
  4. The next step is to "buy" the AMI of the IDS Developer Edition. I put "buy" between quotes, because although you have to put on a purchase order, in reality you will not have to pay any licensing fees. You'll just pay the use of it, at the standard Amazon small instance rate of $0.1/hour. This step and the URLs are perfectly documented in the guide
  5. The next step involves downloading and setting up an installation of the Amazon EC2 API (command line) tools. These are implemented in Java, which means two things: you'll need a Java (JRE) environment on your system, and you can run them on Windows, Unix and Linux. During the setup process it is suggested that you create another key pair that will be used to authenticate your logons to the instance.
  6. Then, instructions are provided in order to launch an Amazon instance based on the IDS Developer AMI that you "purchased" earlier. Detailed instructions are included so that you can access the running instance using an SSH connection. Remember that the authentication will be done through a pair of keys you generated a few steps ago.
  7. After you log in to the instance you'll go through a process similar to the one in the IDS virtual appliance. Besides the usual license acceptance, in this environment you'll also be prompted for:
    1. The keypair you generated (it's suggested that you copy the files and just point to them)
    2. The user's passwords (root, informix and developer)
    3. The configuration of a persistent storage.
      I should have written about this earlier... The AMI instances are volatile. This means that once they're stopped all their "local" storage is gone. So, you should allocate permanent storage from the Amazon EBS service (extra charge of around $0.1/GB/Month). This storage volume can be mounted in /data by the IDS Developer instance. I'll get back to this topic below.

So, after these steps I got a SUSE Enterprise Linux machine, running IDS Developer Edition, with a MACH 11 cluster already configured, running somewhere in the Amazon Cloud, available for me (and anyone I want) to connect to. How much did it cost? Around $0.35, including an EBS storage volume.
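To give an idea of how such a bill adds up, here is a minimal sketch in Python using only the rates quoted in this post ($0.10/hour for a small instance and around $0.10/GB/month for EBS). The function name, the 30-day month and the rounding of partial instance hours up to full hours are my own assumptions for illustration, and the usage values are hypothetical. Check the current Amazon price list for real numbers.

```python
import math

# Rates quoted in this post; actual Amazon pricing may differ.
INSTANCE_RATE_PER_HOUR = 0.10   # small instance, USD per hour
EBS_RATE_PER_GB_MONTH = 0.10    # EBS storage, USD per GB per month
HOURS_PER_MONTH = 30 * 24       # assumption: a 30-day month


def estimate_cost(instance_hours, ebs_gb, ebs_hours):
    """Estimate the bill (USD) for one small instance plus one EBS volume."""
    # Assumption: a partial instance hour is charged as a full hour.
    billed_hours = math.ceil(instance_hours)
    instance_cost = billed_hours * INSTANCE_RATE_PER_HOUR
    # EBS is charged per GB-month, prorated here by the hours the volume existed.
    ebs_cost = ebs_gb * EBS_RATE_PER_GB_MONTH * (ebs_hours / HOURS_PER_MONTH)
    return round(instance_cost + ebs_cost, 2)


# Hypothetical session: about 2.5 hours of use with a 10 GB volume kept for 3 hours.
print(f"Estimated cost: ${estimate_cost(2.5, 10, 3):.2f}")
```

Keeping the EBS volume for a whole month costs far less than keeping the instance running, which is why it makes sense to stop the instance and keep only the volume.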
Please note that IBM didn't just make an IDS Developer AMI available. IBM also established a policy for licensing Informix (and other IBM software) on the Amazon Cloud Computing platform. The relevant announcements are here: http://www-03.ibm.com/press/us/en/pressrelease/26673.wss and here: http://www-01.ibm.com/software/lotus/passportadvantage/pvu_for_Amazon_Elastic_compute_cloud.html (Processor Value Units - PVUs - for Amazon EC2)

So, isn't this a perfect way to test software, to create temporary machines for prototype development, or to support distance teaching? Yes... But I feel there's a small issue:
As stated above, you pay for what you use. This means that you pay for as long as your instances are running. Obviously, to save money, you'll want to stop them when they're not in use. But the instances are volatile, so this is not exactly like a VMWare image. When you restart them you'll get the initial AMI image, and not the machine's state when you shut it down. That's why Amazon provides the EBS volumes. These are permanent, non-volatile storage volumes. As mentioned in the getting started guide, you should keep your database files in these volumes. But even so, if you restart the instance, you'll have to go through the setup screens again. This is not convenient. But there is a simple solution for this: Private AMIs.

When you're running an instance, you can decide to make an AMI from it. The process is called "bundling". You can get the details on how to do it here: http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/bundling-an-ami.html After you create a bundle from a running instance, you can upload it. This will make a new AMI available to you. It's called a private AMI. You can also make it available to the public.
After this you can launch an instance from your own AMI. So theoretically you could customize the IDS Developer AMI, bundle it, upload it as a private AMI and use it to launch your customized instances. You'd have to check the licenses though...
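The bundle, upload and register sequence can be sketched as follows. This small Python script only builds and prints the three command lines; it doesn't run anything. The command names and flags are the ones I recall from the Amazon AMI and API tools, so treat them as assumptions and confirm them against the bundling guide linked above. The function name, the file names, the bucket, the "ids-custom" prefix and the /mnt work directory are all hypothetical placeholders.

```python
def bundle_commands(private_key, cert, account_id, bucket,
                    access_key, secret_key,
                    prefix="ids-custom", workdir="/mnt"):
    """Build (not run) the commands to turn a running instance into a private AMI."""
    manifest_name = f"{prefix}.manifest.xml"
    return [
        # 1. Bundle the running instance's root volume into image parts.
        ["ec2-bundle-vol", "-k", private_key, "-c", cert,
         "-u", account_id, "-d", workdir, "-p", prefix],
        # 2. Upload the bundle (parts plus manifest) to a storage bucket.
        ["ec2-upload-bundle", "-b", bucket,
         "-m", f"{workdir}/{manifest_name}",
         "-a", access_key, "-s", secret_key],
        # 3. Register the uploaded manifest as a new (private) AMI.
        ["ec2-register", f"{bucket}/{manifest_name}"],
    ]


# Placeholder values only; substitute your own keys, account ID and bucket.
for cmd in bundle_commands("pk.pem", "cert.pem", "123456789012",
                           "my-ami-bucket", "ACCESS_KEY", "SECRET_KEY"):
    print(" ".join(cmd))
```

The first two commands are meant to be run inside the instance being bundled, while the registration is done with the API tools from your local system.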

So, in short, in which scenarios could we use Amazon EC2, and more specifically the IDS Developer AMI?
  • You need a machine for a team of developers to work on a new project for a short period of time
    It's easy to set up and use, you'll know how much it will cost you, and you don't have to depend on your own resources
  • You need to make a customer demo for an application you developed. You just install it, and use it at your customer site. Better yet, your customer can do their own testing even after you leave
  • You want to provide some application training remotely (or long distance). Again, just install it, give the access details to your students, and there you go...
  • You want to learn about IDS and you don't want to install the virtual appliance locally (you don't have the necessary resources for running it)
  • And of course, you have a startup company, and you don't want to own your own datacenter. So you just rent it... In this scenario you would need paid IDS licenses of course...
Summary
In this long post I've gone through the following points:
  • Why IDS is a perfect match for virtualized environments
  • IBM Informix virtual appliance. A pre-configured VMWare image with IDS Developer Edition already installed. Everything ready for your experiments
  • IBM Developer Edition AMI (Amazon Machine Image). The machine image in Amazon EC2 format that IBM made available for use in the Amazon EC2 environment
I haven't gone into details of the virtual appliance contents. But I recommend that if you're interested in IBM Informix Dynamic Server, you should really test it. It probably has everything you'll need to learn and test IDS.

Glossary
  • Amazon EC2
    A cloud computing environment run by Amazon
  • AMI
    Amazon Machine Image - A pre-built virtual machine that you can use to start an Amazon EC2 instance
  • Amazon EC2 instance
    A running virtual machine in the Amazon EC2 environment
  • Amazon S3
    Amazon's Simple Storage Service
    This is a non-volatile storage service provided by Amazon. It costs around $0.1/GB/Month
  • IDS Developer Edition
    A version of IBM Informix Dynamic Server that you can use for application development.
    It's freely available and you can use it for learning, testing and application development. Please check the license for details
  • VMWare Appliances
    Pre-configured virtual images ready to run in one of the VMWare products (IDS Developer image is available)
References