Wanted to share this as I know a few others on here use HP Z-600s. Since the later HP workstations (the 620 and 640) have turned to crap, I have kept my Z-600 (ver 3) for longer than I usually would. A month ago I started getting a RAM error at the end of boot up, along with an F1 prompt that allowed it to continue and operate without that 8 GB in use. First thing I tried was swapping RAM around between the indicated bad position and the one next to it. Still had the same error. Then I swapped the CPUs around, but same thing again. So that left the motherboard to replace. Found a brand new ver 3 motherboard and it's now working great again. Anyway, if you ever need to swap out a motherboard, be sure you know which version you have by looking at either the part number tag on the motherboard itself, which ends in 001, 002 or 003, or the bootblock date in the BIOS.
HP Z-600 RAM Error At Boot Up
-
Do you know what the RAM error was?
Your computer uses ECC RAM, as many computers do, today. As you know, ECC stands for "Error Correction Code" which tries to compensate for random bit-flips in memory. If there is a bit-flip error that ECC can't correct, the computer will throw a "Bad RAM" flag.
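To make the "correct one bit-flip, flag two" behavior concrete, here is a toy sketch in Python. It is not what the Z-600's memory controller actually runs: real ECC DIMMs protect 64 data bits with 8 check bits, while this scaled-down version protects 4 data bits with 4 check bits using an extended Hamming code. The function names `encode` and `decode` are made up for the illustration.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (a list of 0/1 values).

    Index 0 holds an overall parity bit; indices 1..7 are the classic
    Hamming positions (check bits at 1, 2, 4 and data bits at 3, 5, 6, 7).
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the word is uncorrectable."""
    # XOR of the positions of all set bits: 0 for a clean word,
    # otherwise the position of a single flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_odd = sum(code) % 2 == 1

    fixed = list(code)
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Exactly one bit flipped (syndrome 0 means the overall parity bit).
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even parity: two bits flipped.
        return "uncorrectable", None

    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return status, nibble


word = encode(0b1011)
word[6] ^= 1                 # one bit-flip: silently repaired
print(decode(word))
word[2] ^= 1                 # second bit-flip: this is the "Bad RAM" case
print(decode(word))
```

A single flip is repaired and the data comes back intact; a second flip in the same word can only be detected, which is the point where the machine raises the error.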
My computer uses ECC RAM and I got an older computer from a friend of mine which has compatible RAM with my computer. One of the RAM sticks started throwing ECC errors and I had to figure out what was causing it.
When my computer starts getting RAM errors like that, it automatically disconnects that RAM slot and keeps running, mostly normally, provided that there is still enough good RAM to keep the computer going. Since I've got all eight RAM slots full for almost 24 GB, I can lose a slot but hardly notice.
After much dicking around, I finally discovered the source of the problem. My computer uses riser boards to hold the RAM, and one of the slots on one of the risers was bad.
Like you, I started swapping RAM around between the two boards, but the problem always stayed with that one riser.
I bought a new riser board from eBay and solved the problem for a little over $50.00.
Haven't had a problem since.
Last edited by Randy Stankey; 01-02-2023, 03:40 PM.
-
Yes, it is ECC RAM... The easiest place to start was by swapping the RAM; a CPU swap is more involved, as I'm sure you know. Also, over at the HP Tech Site there were posts from others that had the same or a similar issue. The solution was not always the same, but it was always one of the three things I mentioned in the first post. I was lucky to find the NOS ver 3 motherboard, as it is the only one that can take the hex-core Intel CPUs, and the RAM sockets are the standard type. I think this problem had to be either a bad socket, or the traces on the board itself failed. The cool thing about these computers is that no tools are required to do any repair job on them, except for the CPU heatsinks, for which there is a Torx wrench clipped inside the case.
P.S., Windows displayed the full amount of memory, but showed a lesser usable amount...
Last edited by Mark Gulbrandsen; 01-02-2023, 04:45 PM.
-
From my time working on an SMT line, one of the most common sources of failures is bad solder joints. BGAs and other "flat-pack" type components are notorious for this. You can't see under the components to inspect them, either. Because of this, some circuit boards were required to be X-rayed to be sure that all the joints were good.
When bad solder joints were found, they were sent to the rework department where they would be reheated and the suspect components would be pressed down.
As you know, lead-free solder degrades with heat cycling, over time, and can form dendrites, causing failures years after the board is put into use.
Again, in the SMT shop, if something like that is found, the solution is to reheat and reseat.
Since you already have a replacement board, it wouldn't be hard for you to give that old MoBo the heat-n-beat treatment. Stick it in an oven at 240 °C until the solder gets soft. If there are any bad solder joints, they MIGHT just reflow by themselves. I've seen it work in the shop. They just stick them back through the conveyor oven.
-
Never having done much SMT except for building audio DACs, baking the bad MB never occurred to me. The DACs I built were all hand soldered... It's also a lot of work, though certainly not difficult, to switch the motherboard, CPUs, etc. The new one works fine, so I am just going to leave well enough alone. Probably only going to use it a few more years before replacing it with some much newer workstation. But it won't be an HP, as they have really turned to crap...
-
The Christie ACTs have this same problem with the BGA solder joints on the "Coldfire" CPU chip. When powered on, the unit just sat there and did nothing more than show me the green light-up buttons. I placed a piece of tinfoil on my gas stove top, set the burner to its lowest setting, and formed a tinfoil chute over the chip. I then heated the top of the chip with my heat gun for somewhere in the neighborhood of 20 minutes while the stove kept heat on the bottom. I let the board cool down and installed it back in the ACT. So far the unit is functioning. Your mileage may vary. It seems these are subject to the same crappy lead-free solder balls, combined with a lack of cooling. At the very least, a heat sink wouldn't have been a bad idea on the chip that is in charge of everything inside an electronic device. The ACT even has electronic provisions for a fan and a place to physically mount one.
-
Yeah, Randy suggested I try that, but the board is pretty large, and a number of ICs are mounted on the bottom. It would have been a good excuse to buy a heat gun too. But the replacement board was cheap at $35, so I just replaced the board and tossed the bad one.
On the Christie, I think it will likely keep running. Some places seem to run the solder flow machine a little too fast, resulting in your problem.
-
Originally posted by Mark Gulbrandsen: Some places seem to run the solder flow machine a little too fast, resulting in your problem.
Like Mark says, I've seen people run jobs through an oven without checking the recipe settings (using a generic recipe) or else not verifying with a mole. It takes time, fifteen to thirty minutes, for the oven to stabilize its temperature before you can run parts through. They think that cutting corners like this will save them a few minutes, but it's a false economy. The minute saved very often results in bad product and increased rework. People seem to ignore the fact that, in manufacturing, every dollar (or unit of work) lost actually costs you three. You lose one unit of work/value for the part damaged. You lose a second unit of work to fix the damaged part. Plus, you lose a third unit making up for the work/value that you could have made if you didn't have to fix a damaged part in the first place.
Yeah, I get it. In manufacturing, you need to make as many parts, as quickly as you can, as well as you can, in the shortest amount of time that's practical. The rub is, if you fuck up one part by trying to save a few minutes, you could have spent those few minutes making three good ones.
That's where a lot of people make the big mistake. They try to cut corners to save a penny when rework will cost them three.
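The "one lost costs you three" rule can be written out as a quick back-of-the-envelope calculation. This is only a sketch of the arithmetic described above, assuming each of the three losses is worth exactly one unit; the function name `true_cost_of_bad_parts` and the `unit_value` parameter are made up for the example.

```python
def true_cost_of_bad_parts(bad_parts, unit_value=1.0):
    """Total loss when each bad part costs three units, per the rule above."""
    damaged = bad_parts * unit_value        # value of the part that was ruined
    rework = bad_parts * unit_value         # work spent fixing it (assumed equal)
    missed_output = bad_parts * unit_value  # good part not made during rework
    return damaged + rework + missed_output


# Five bad boards at $2 of work each cost $30 in total, not $10.
print(true_cost_of_bad_parts(5, unit_value=2.0))
```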